Quantum systems carry information. Quantum theory supports at least two distinct kinds of
information (classical and quantum), and a variety of different ways to encode and preserve
information in physical systems. A system's ability to carry information is constrained and defined
by the noise in its dynamics. This paper introduces an operational framework, using
information-preserving structures, to classify all the kinds of information that can be perfectly
(i.e., with zero error) preserved by quantum dynamics. We prove that every perfectly preserved code
has the same structure as a matrix algebra, and that preserved information can always be corrected.
We also classify distinct operational criteria for preservation (e.g., "noiseless", "unitarily
correctable", etc.) and introduce two new and natural criteria for measurement-stabilized and
unconditionally preserved codes. Finally, for several of these operational criteria, we present
efficient [polynomial in the state-space dimension] algorithms to find all of a channel's
information-preserving structures.

PACS numbers: 03.67.Pp, 03.67.Lx, 03.65.Yz, 89.70.+c



I. INTRODUCTION 

Physical systems can be used to store, transmit, and 
transform information. Different systems can carry dif- 
ferent kinds of information; classical systems carry classi- 
cal information, while quantum mechanical systems can 
carry quantum information. The system's dynamics also 
affect the kind of information that it carries. For example,
decoherence [1] can restrict a quantum system to
carry only classical information (or none at all). This 
suggests that perhaps a quantum system's dynamics can 
select other kinds of information, neither quantum nor 
classical, but something in between. The central result 
of this paper is an exhaustive classification of exactly 
what kinds of information can be selected in this way. 

Preservation of information in physical systems is im- 
portant in several contexts. In communication theory, 
information originates with a sender ("Alice") who ac- 
tively conspires with a receiver ("Bob") to transfer it 
over a communication channel. Computational devices 
require memory registers that can store information in 
the face of repeated noise. Experimental and observa- 
tional sciences require, in a more or less explicit way, the 
transmission of information from a passive system of in- 
terest (perhaps a distant galaxy, or a nanoscale device), 
through a chain of ancillary systems, to an observer. In 
each case, achieving the desired transformation requires 
first that the information be preserved by a noisy dynamical
process or "channel" - yet, each operational scenario
poses a subtly different notion of "preserved". 

In this paper we develop a theory that covers all these
situations in a unified framework. We start by establishing
a general setting for information (and its preservation),
using codes (Section II). We state a minimal necessary
condition for information preservation, then prove
that it is also sufficient (in a particular strong sense),
deriving a powerful structure theorem for preserved codes
(Section III). On this foundation, we build a hierarchy
of different operational criteria for preservation (Section
IV). Stricter criteria correspond to additional operational
constraints - e.g., that information persist for more than
one application of the noise. On the one hand, some
of these criteria allow us to make natural contact with
previously studied approaches to information preservation
- including pointer states [1], decoherence-free subspaces
[2] and noiseless subsystems [3-5], and quantum
error correcting codes [6] - while also proposing a couple
of new ones, related to what we call "measurement-stabilized"
and "unconditionally preserved" codes. On
the other hand, our main contribution is to gather them
all into a single framework using information-preserving
structures (IPSs). IPSs classify the kinds of information
that dynamical processes can preserve. In particular, we
focus here on perfect IPSs, corresponding to zero-error
information. Finally, we consider how to find these structures
for a given noisy process (Section V). It is NP-hard
to find a channel's largest correctable IPS, but for stricter
preservation criteria it can be much easier. We provide
efficient and exhaustive algorithms to find noiseless,
unitarily noiseless, and unconditionally preserved IPSs.

Our IPS framework establishes an explicit and rigorous
connection between perfectly preserved information and
fixed points of channels. By focusing on fixed points (see
also [7]), rather than on the noise commutant, it provides
a first step toward understanding approximate IPSs, making
contact with stability results for decoherence-free
encodings under symmetry-breaking perturbations [8], and
with approximate QEC [9-11]. Our structure theorem for
the fixed points of completely positive maps extends previous
results that apply only to unital processes [12, 13],
or processes with a full-rank fixed state [14]. Our algorithm
for finding noiseless and unitarily noiseless codes
improves on algorithms that are inefficient (e.g., Refs.
[15, 16]) or otherwise restricted to purely noiseless
information [17] or unital channels [18].

Early aspects of this work appeared in Ref. [19]. Here,
we provide more results, full proofs, and detailed discus- 
sion. 



II. PRESERVED INFORMATION 

"What kinds of information can a quantum dynamical 
process preserve?" is a technical question, but one that 
requires a firm conceptual foundation. This section aims 
to provide one. We begin with an operational definition 
of "information," then apply it to quantum theory. We 
use well-known results on the accuracy with which quan- 
tum states can be distinguished to establish a mathe- 
matical framework in which this central question can be 
answered. 

"Information" has a variety of meanings. Any crisp 
definition will inevitably run afoul of some alternative 
usage. Throughout this paper, we will follow this basic 
operational definition: 

Principle 1. Information is a resource, embodied in a 
physical system, that can be used to answer a question. 

A physical system S can carry information. If one 
party (Alice) sends it to another (Bob), then the recipient 
can use it to answer a question. More precisely, posses- 
sion of S gives Bob a higher probability of guessing the 
correct answer. However, if S evolves during transmission
- i.e., it undergoes a dynamical map ℰ - then some
information might be lost. As a result, ℰ(S) may be
less useful than S. It is not yet clear how to determine
whether information is "preserved", but two principles 
seem self-evident: 

Principle 2. If nothing happens to a system, then all 
the information in it is preserved. 

Principle 3. If a system evolves as S → ℰ(S), and ℰ(S)
is strictly less useful than S in answering some question,
then some information in S was not preserved.

These simple criteria bracket the (as-yet undefined) notion
of preservation - of all the information in a system.
But information can be encoded into one part of a sys- 
tem. Such information may be preserved even if other 
parts are damaged or destroyed. To properly represent 
this notion, we appeal to another self-evident principle: 



Principle 4. If some property or parameter of a system 
is already known to all parties (e.g. Alice and Bob), then 
it carries no useful information. 

For example, if a quantum system S is known to be
in the state |ψ⟩⟨ψ| by all parties, then nothing is gained
by transmitting it. Since a known property of S carries 
no information, disturbing it has no effect on the infor- 
mation embodied in the system. So, we can represent 
the sequestering of information in a very general way by 
stating a promise or precondition, which guarantees cer- 
tain properties of S. Those properties, being already 
known, carry no useful information. Information carried 
by S conditional on the promise can be preserved, even 
if other properties (constrained by the promise) are dis- 
turbed. 

Mathematically, a precondition on S is a restriction of 
its state, to some (arbitrary) subset. We call such a set 
a code. 

Definition 1. A code C for a system S is an arbitrary 
subset of the system's state space. 

Codes carry information. Each system S has a nat- 
ural "maximum code" containing all its possible states. 
Smaller codes for that system carry strictly less infor- 
mation - but may be preserved even when the system's 
maximum code is not. A code that is a strict subset of 
another preserved code is uninteresting, so we will focus 
on maximal preserved codes. 

Definition 2. A preserved code C is maximal if there
exists no preserved C_big ⊃ C. That is, if adding any other
state would render C unpreserved.

We can narrow our focus even more. If S has two
preserved codes, C_big and C_small, where C_big is strictly
"bigger" than C_small, then we are not interested in C_small.
C_big is "bigger" than C_small if it has a proper subset that
is identical or isomorphic to C_small. We can make this
rigorous, but only by borrowing a technical definition
from the next section (see Definition 4):

Definition 3. A preserved code C is maximum if and
only if there is no preserved C_big such that C is isometric
to a strict subset C_small ⊂ C_big.

We will generally restrict our attention to maximum
codes (footnote 1). We need a precise definition of a "preserved"
code. We begin by adapting Principles 2 and 3 to codes:

Principle 5. The information in a code C is preserved
by a dynamical map ℰ if ℰ leaves every state in C unchanged.

Principle 6. The information in a code C is preserved
by a dynamical map ℰ only if ℰ(C) is as useful as C for
answering any question.

1 Graph theorists may recognize this terminology. Maximal and
maximum codes have the same relationship as maximal and maximum
cliques, or independent sets. Note, however, that unlike a
graph, a channel need not have a unique maximum code. If a
channel preserves either a quantum bit or a classical trit, they
are incomparable - neither is bigger than the other.

These are sufficient and necessary (respectively) operational
conditions for preservation. Principle 6 seems
much weaker than Principle 5 - but we will show that it is
actually not. If Principle 6 is satisfied, then there is a
physically implementable recovery operation that restores
every code state. The ability to perform this recovery is
a resource - a reasonable one, but a nontrivial one. We
will also consider several weaker resources (e.g., restrictions
on what recovery operations can be implemented),
and the corresponding stronger notions of preservation,
in Section IV.

This concludes the "philosophical" part of our frame- 
work, and in what follows we will build on these foun- 
dations to establish technical results. Two final points 
deserve mention, however: 

(i) Identifying "information" with codes (arbitrary sets
of states) is intended to be a very general paradigm. A
system's state, by definition, specifies everything that can
be known about that system. Every question that can
be answered using S boils down to a question about the
state of S, and variations in that state (restricted to some
particular code) encode information. If there are exceptions
to this rule - that is, notions of information, consistent
with Principle 1, that cannot be represented using
codes - then we are not aware of them (footnote 2). An extended
discussion can be found in Appendix A 2.

(ii) Our definition of "information" may not appear
congruent with Shannon's theory of communication [20, 21].
In fact, it is quite compatible. There are, however,
some subtle differences: as mentioned, we focus on zero-error
information; furthermore, we consider a single use
of a communication channel, rather than N uses with
N → ∞. An extended discussion can be found in Appendix A 1.



A. Systems, states, codes, and channels in
quantum theory

So far, we have used a language consistent with a broad
range of physical theories. Let us now specialize to quantum
theory. States of quantum systems are represented
by density operators ρ, which are positive trace-1 operators
on the system's Hilbert space ℋ. Quantum dynamical
maps (also known as channels) are described by completely
positive (CP), trace-preserving (TP) linear maps
on density operators. A CPTP map ℰ can be represented
in two equivalent ways. In one formulation, the initial
system S_A comes into contact with an uncorrelated
environment E_0, they evolve unitarily, and then some part
E_f of this joint system is discarded (footnote 3), yielding a
reduced state for the final system S_B:

\[ \rho_B = \mathcal{E}(\rho_A) = \mathrm{Tr}_{E_f}\left[ U \left( \rho_A \otimes \rho_{E_0} \right) U^\dagger \right]. \tag{1} \]

The other representation of a CP map is called the
operator-sum representation:

\[ \rho_B = \mathcal{E}(\rho_A) = \sum_i K_i \rho_A K_i^\dagger, \tag{2} \]

where the Kraus operators {K_i} satisfy Σ_i K_i† K_i = 𝟙.
This representation is mathematically simpler but less
physically intuitive (for a complete treatment of CP
maps, see Refs. [22, 23]). Note that in either representation,
S_A and S_B may be different systems, with different
Hilbert spaces. However, the special case where they are
the same is very important - for instance, all continuous-time
processes are described by such maps - and we will
often implicitly assume it, dropping A and B subscripts
and relying on context to illustrate whether "S" refers to
the channel's input or its output.

Codes for quantum systems are sets of quantum states,
e.g. C = {ρ_1 ... ρ_k}. The code represents a promise that
the system will be prepared in some ρ ∈ C. Each distinct
code represents a potentially distinct kind of information.
Note, however, that we are not introducing an
infinite proliferation of fundamentally different "kinds"
of information, nor are we suggesting that a qubit carries
fundamentally different information from a qutrit: Systems
with isomorphic state spaces carry the same kind
of information. N qutrits are equivalent to N log_2 3 qubits,
so they carry the same kind of information, but more of it.
The important dividing line is between systems that have no
asymptotic equivalence, like a qubit and a classical bit (footnote 4).

Now that we have a well-defined mathematical theory,
we need a mathematical definition of preservation.



2 A simple and important example is entanglement between S and
a reference system R. Though not explicitly mentioned,
entanglement is easy to characterize in our setting. If S and R are
maximally entangled, then S can be post-selectively prepared in
any pure state |ψ⟩⟨ψ| by projecting R into some |ψ'⟩⟨ψ'|.
Entanglement is preserved if and only if the code containing all of
these conditional states is preserved.



3 A technical note is in order here. If the environment E_0 is
initially correlated with the input system S_A, then the resulting
dynamics is generally not CP, and so initial decorrelation is a
common assumption in the theory of open quantum systems.
For our purposes, it is more than just an assumption. If S_A is
initially correlated with its environment, then the latter contains
information about S_A. The system and its environment together
may contain more information about S_A than does S_A itself! In
the course of the ensuing interaction, that information may flow
back into the system. It is impossible (ill-defined, even) to say
whether information in S_A has been preserved in such a case, for
it may have been replaced with information initially residing in
E_0. Such an interaction is not, in any sense, "noise".

4 Two systems S_A and S_B have an asymptotic equivalence if there
is a constant R such that for all ε > 0 and N → ∞, (i) N(R − ε)
copies of S_A are strictly less powerful than N copies of S_B, and
(ii) N(R + ε) copies of S_A are strictly more powerful than N copies
of S_B. Thus, any two finite non-trivial quantum systems have
an asymptotic equivalence in this sense.






Principle 6 uses the very general idea of "questions." A
simple and well-defined set of questions turns out to be
sufficient: "Was the system prepared in state ρ or state
σ?" Here, ρ and σ are states in the code C. In general,
these questions cannot be answered with certainty,
for most pairs of states are not perfectly distinguishable.
But if Bob cannot distinguish them as well as Alice, then
information has been lost. Of course, there may well be
many other questions that could be asked, but it turns
out that if these well-defined questions are all preserved,
then the code can be corrected (and therefore every question
must be preserved!).

Example 1. Suppose that S is a quantum bit. If its dynamics
are noiseless, then every state passes unchanged
through the channel. We can describe the preserved
information in terms of a code C_qubit that contains all the
possible states for a qubit. Now, suppose S experiences
a dephasing channel, which transforms an arbitrary
superposition of the computational states |0⟩ and |1⟩ into a
mixture,

\[ \mathcal{E} : \alpha|0\rangle + \beta|1\rangle \mapsto |\alpha|^2\, |0\rangle\langle 0| + |\beta|^2\, |1\rangle\langle 1|, \]

and which maps the Bloch sphere into itself by collapsing it
onto the axis through |0⟩ and |1⟩.

[Figure: the action of the dephasing channel on the Bloch sphere.]

The code C_qubit is no longer preserved. Because the
two states |±⟩ = (|0⟩ ± |1⟩)/√2 are both mapped to ρ_B = 𝟙/2,
Bob cannot answer the question "Was S prepared in |+⟩
or |−⟩?" However, the more restricted code C_cbit =
{|0⟩⟨0|, |1⟩⟨1|} is preserved, for Bob can distinguish between
these states just as well as Alice. The preserved
code describes a different kind of information: one classical
bit.
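
Example 1 can be checked numerically. The following is a minimal sketch, assuming the dephasing channel is written in operator-sum form with Kraus operators K_0 = |0⟩⟨0| and K_1 = |1⟩⟨1|; the helper names (`matmul`, `dagger`, `projector`, `dephase`, `close`) are illustrative and do not come from the paper.

```python
# Illustrative sketch (helper names are not from the paper):
# complete dephasing of a qubit in the computational basis,
# written in operator-sum form with 2x2 matrices as nested lists.
from math import sqrt

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def projector(psi):
    """Density matrix |psi><psi| of a normalized state vector."""
    return [[psi[i] * complex(psi[j]).conjugate() for j in range(2)]
            for i in range(2)]

# Assumed Kraus operators of the dephasing channel: |0><0| and |1><1|.
KRAUS = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]

def dephase(rho):
    """Apply rho -> sum_i K_i rho K_i^dagger."""
    out = [[0, 0], [0, 0]]
    for K in KRAUS:
        term = matmul(matmul(K, rho), dagger(K))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

s = 1 / sqrt(2)
zero, one = projector([1, 0]), projector([0, 1])
plus, minus = projector([s, s]), projector([s, -s])

# |+> and |-> have the same image, so they can no longer be told apart.
print(close(dephase(plus), dephase(minus)))
# The classical-bit code {|0><0|, |1><1|} is left unchanged.
print(close(dephase(zero), zero), close(dephase(one), one))
```

Both |+⟩ and |−⟩ are sent to the maximally mixed state, while the two computational-basis states are fixed points, in line with the statement that only one classical bit survives.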

Here are some familiar examples of preserved informa- 
tion, represented as codes. 

Example 2. A pointer basis comprises a set of mutually
orthogonal "pointer states" {|ψ_1⟩ ... |ψ_n⟩} that are
unaffected (or "least affected") by noise - as originally
introduced in the study of quantum measurement and
decoherence [1]. A pointer basis can be described by the code
containing all the pointer states (PSs) |ψ_k⟩⟨ψ_k| and their
convex combinations. Classical information is stored in
the index k, but not quantum information, because
superpositions are not preserved, and thus cannot be included
in the code. PSs are preserved in the strongest possible
sense: Every state in the code is a fixed point of ℰ.

Example 3. A decoherence-free subspace (DFS) is
an entire subspace of the system's Hilbert space, V ⊆ ℋ,
which is invariant under the noise [2] (see also Zurek's
prior discussion of "pointer subspaces" [24]). The
corresponding code C contains every density operator
supported on V. Since C includes superpositions of any given
basis for V, a DFS preserves quantum information, and
can in principle support encoded quantum computation.
Like pointer bases, DFSs are preserved in the strongest
sense (although, especially in the context of Markovian
dynamics, the definition is commonly relaxed to allow
unitary evolution, see also [25, 26]).



Example 4. A noiseless subsystem (NS) shares with
a DFS the property that it can store quantum information.
Unlike a DFS, an NS can exist even if no pure
state in ℋ is invariant. According to the original
definition [3, 4], it suffices that the noise has a trivial
action on a "factor" of ℋ. That is, S supports an NS if
there exists a subspace ℋ_AB ⊆ ℋ that can be factored as
ℋ_AB = ℋ_A ⊗ ℋ_B, so that for every pair of states ρ_A, ρ_B
supported on ℋ_A, ℋ_B, respectively,

\[ \mathcal{E}(\rho_A \otimes \rho_B) = \rho_A \otimes \rho'_B, \tag{3} \]

for some state ρ'_B on ℋ_B. Thus, the restriction of ℰ to
ℋ_AB obeys

\[ \mathcal{E} = \mathbb{1}_A \otimes \mathcal{E}_B, \tag{4} \]

for some CPTP map ℰ_B on ℋ_B. Since, for every state ρ_AB
supported on ℋ_AB,

\[ \mathrm{tr}_B\, \mathcal{E}(\rho_{AB}) = \mathrm{tr}_B\, \rho_{AB}, \tag{5} \]

it is clear that quantum information is preserved in the
reduced state of subsystem A. However, it is not
immediately obvious that (as in Examples 2 and 3) there is a
corresponding fixed code for S. In fact, the existence of
such a code follows from Eq. (4) and the fact that every
channel ℰ_B has at least one fixed point τ_B [27]. Thus,
the code C_NS = {ρ_A ⊗ τ_B, ∀ρ_A}, where ρ_A is arbitrary
on ℋ_A, but τ_B is fixed, is invariant under ℰ.
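
The invariance claimed at the end of Example 4 follows in one line from the tensor-product form of the restricted channel. As a short worked step, assuming only that τ_B is a fixed point of the CPTP map on ℋ_B (written here as \mathcal{E}_B):

```latex
\mathcal{E}(\rho_A \otimes \tau_B)
  = (\mathbb{1}_A \otimes \mathcal{E}_B)(\rho_A \otimes \tau_B)
  = \rho_A \otimes \mathcal{E}_B(\tau_B)
  = \rho_A \otimes \tau_B .
```

Every state of the form ρ_A ⊗ τ_B is therefore a fixed point, whatever ρ_A is, so the whole noiseless-subsystem code is fixed in the same sense as a pointer basis or a DFS.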

Example 5. A quantum error correcting code
(QECC) [6, 28] also preserves quantum information, but
according to a weaker criterion. A QECC is a subspace V
for which there exists a physical recovery operation ℛ so
that (ℛ ∘ ℰ)(|ψ⟩) = |ψ⟩ for all |ψ⟩ ∈ V. As with a DFS,
the corresponding "correctable code" contains all states
supported on V. Unlike the previous examples, this code
is not fixed. However, it is clearly preserved, because V
can be turned into a DFS by applying ℛ. An "operator
QECC" [29] is an NS for ℛ ∘ ℰ. Another variant
stipulates active intervention before the noise occurs [3],
in which case the code is "protectable" rather than
correctable [11]. While protectable codes will not be further
discussed in the present work, the notions of protectability
and correctability are not fundamentally different and
may, to a large extent, be viewed as "dual" to one another,
as elucidated in [11].

The above examples are not exhaustive, but they illustrate
the diversity of criteria for "preserved" information.
Each example is specified by a different algebraic condition,
dictated either by operational constraints or by its
relevance to the task at hand. We hope that unifying
them will bring clarity to experimental implementations
of these ideas [30-32].

The key point of our framework, though, is to explore 
beyond these well-known examples. In particular, all 
the situations illustrated above can be described intu- 
itively as "quantum information" or "classical informa- 
tion." What we would like to know is whether more 
exotic codes are possible - whether some weird channel 
can preserve a form of information that is entirely unlike 
a pointer basis, NS, or QECC. We need a rigorous criterion
for preservation of codes, based on Principles 5 and
6. Principle 5 is straightforward, but Principle 6 refers
to any operational task. Our strategy will be to identify 
one particular task - distinguishing between code states. 
Because we focus on just one task, we will obtain a nec- 
essary condition. Having done so, our next challenge will 
be to bring these conditions together. 

B. Single-shot distinguishability, Helstrom's 
theorem, and the 1-norm 

Suppose that Bob has access to a single copy of the
system S, and he wishes to guess correctly whether it
was prepared in state ρ or state σ (both of which are in
C). He seeks to maximize the probability that his guess
is correct, and he knows that the prior probabilities of ρ
and σ are (respectively) p and (1 − p). He can measure
S to help him decide, and the optimal course of action
is determined by Helstrom's theorem [33]:

Helstrom's Theorem. Suppose a quantum system
S was prepared either in state ρ or in state σ, with
respective probabilities p and (1 − p). The highest probability
of guessing correctly which was prepared is obtained
by measuring the Hermitian operator Δ_p = pρ − (1 − p)σ,
then guessing "ρ" upon obtaining a result corresponding
to a positive eigenvalue and "σ" in the case of a negative
eigenvalue. If a zero eigenvalue is obtained, either guess
is equally good. The success probability is given by

\[ P_H(\rho, \sigma; p) = \tfrac{1}{2}\left( 1 + \|\Delta_p\|_1 \right), \]

where ‖·‖_1 refers to the 1-norm,

\[ \|A\|_1 \equiv \mathrm{tr}|A| \equiv \mathrm{tr}\sqrt{A^\dagger A}. \]
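
To make the formula concrete, here is a minimal sketch that evaluates the Helstrom success probability for a single qubit. It assumes 2x2 Hermitian matrices stored as nested lists and uses the closed-form eigenvalues of a 2x2 Hermitian matrix; the function names `trace_norm_2x2` and `helstrom` are illustrative, not taken from the paper.

```python
# Illustrative sketch: Helstrom success probability for two qubit states.
from math import sqrt

def trace_norm_2x2(a):
    """1-norm (sum of |eigenvalues|) of a 2x2 Hermitian matrix."""
    a00 = complex(a[0][0]).real
    a11 = complex(a[1][1]).real
    mean = (a00 + a11) / 2
    # Eigenvalues of a 2x2 Hermitian matrix are mean +/- radius.
    radius = sqrt(((a00 - a11) / 2) ** 2 + abs(a[0][1]) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def helstrom(rho, sigma, p):
    """P_H = (1 + ||p*rho - (1-p)*sigma||_1) / 2."""
    delta = [[p * rho[i][j] - (1 - p) * sigma[i][j] for j in range(2)]
             for i in range(2)]
    return 0.5 * (1 + trace_norm_2x2(delta))

zero = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
one = [[0.0, 0.0], [0.0, 1.0]]    # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|

print(helstrom(zero, one, 0.5))   # orthogonal states
print(helstrom(zero, zero, 0.7))  # identical states: only the prior helps
print(helstrom(zero, plus, 0.5))  # non-orthogonal states
```

Orthogonal states can be identified with certainty, identical states leave Bob with nothing better than guessing the more likely preparation (probability max(p, 1 − p)), and non-orthogonal states fall in between.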

The success probability P_H is a measure of the
distinguishability between ρ and σ. It is non-increasing under
any CPTP map, because the 1-norm is contractive under
CPTP maps [34]. So, in order for {ℰ(ρ), ℰ(σ)} to
be as distinguishable as {ρ, σ}, we require that for every
prior probability p, the Helstrom strategy yields the
same success probability for distinguishing ρ from σ as
for distinguishing ℰ(ρ) from ℰ(σ):

\[ P_H\left( \mathcal{E}(\rho), \mathcal{E}(\sigma); p \right) = P_H(\rho, \sigma; p). \]

If Bob needs to distinguish between two sets of states,
{ρ_k} and {σ_k}, he assigns prior probabilities {p_k} and
{s_k} to the {ρ_k} and {σ_k}, respectively. Then his task is
to distinguish

\[ \rho = \frac{1}{\sum_k p_k} \sum_k p_k \rho_k \]

from

\[ \sigma = \frac{1}{\sum_k s_k} \sum_k s_k \sigma_k, \]

where the prior probabilities of ρ and σ are, respectively,
p = Σ_k p_k and 1 − p.

This measure of distinguishability is, in fact, a metric 
on the space of linear operators. Its preservation implies 
a kind of rigid equivalence, which we make precise with 
the following definition: 

Definition 4. Two codes C_1 and C_2 are 1-isometric (or
just "isometric") to each other if and only if there exists
a linear 1:1 mapping f : C_1 → C_2 such that, for all ρ, σ
in the convex closure of C_1 and all p ∈ [0, 1],

\[ \| p f(\rho) - (1-p) f(\sigma) \|_1 = \| p\rho - (1-p)\sigma \|_1 . \]

Definition 5. A code C is 1-isometric (or just "isometric")
for a CPTP process ℰ only if C is isometric to
ℰ(C).

So, if a code is isometric for a given map ℰ, then
‖pℰ(ρ) − (1 − p)ℰ(σ)‖_1 = ‖pρ − (1 − p)σ‖_1 for all ρ, σ
in the convex closure of C and p ∈ [0, 1]. A stronger
characterization is given by the following:

Definition 6. A code C is fixed by a CPTP channel ℰ
if and only if ℰ(ρ) = ρ for all ρ ∈ C.



C. Criteria for preservation 

We are now in a position to state Principle 5 more
precisely:

Strong Condition for Preservation. A sufficient
condition for C to be preserved by ℰ is that C be fixed by ℰ.

The Strong Condition is obviously sufficient, but
(as demonstrated by error correcting codes) it is not
necessary for preservation. Principle 6 implies a host
of necessary conditions - one for every operational
task. We choose one in particular: We demand that
ℰ(ρ) and ℰ(σ) be just as distinguishable (footnote 5) as ρ and σ.
We also require that questions like "Was S prepared
in one of the states {ρ_1, ρ_2, ρ_3, ...}, or in one of the
states {σ_1, σ_2, σ_3, ...}?" should be preserved as well, so
convex combinations of code states should maintain their
pairwise distinguishability. There is nothing inherently
special about this particular operational task, except
that it produces a useful and convenient mathematical
condition:

5 Note that ρ and σ need not be perfectly distinguishable to start
with. A QECC contains non-orthogonal states that cannot be
perfectly distinguished, but they can be distinguished just as well
after ℰ as before.

Weak Condition for Preservation. A necessary
condition for C to be preserved by ℰ is that C be isometric
for ℰ.

These two criteria form the foundation of our frame- 
work. To illustrate their application, here are some ex- 
amples both simple and subtle. 

Example 6. Suppose S is a classical system with four
states labeled {0, 1, 2, 3}, each perfectly distinguishable
from the others. S passes through a channel that maps
state k randomly (with equal probability) to k or k + 1 (mod 4),
represented as a stochastic map

\[ \mathcal{E} = \begin{pmatrix} \tfrac12 & 0 & 0 & \tfrac12 \\ \tfrac12 & \tfrac12 & 0 & 0 \\ 0 & \tfrac12 & \tfrac12 & 0 \\ 0 & 0 & \tfrac12 & \tfrac12 \end{pmatrix}. \]

A stochastic map's information-preserving properties can
conveniently be represented by an adjacency graph for the
input states, where state j is connected to state k if ℰ(j)
overlaps with ℰ(k). This map's adjacency graph is:

[Figure: adjacency graph on the vertices 0, 1, 2, 3, in which each state k is joined to k + 1 (mod 4).]

The code C_4 = {0, 1, 2, 3} representing all information
about S is not preserved, because 0 and 1 are perfectly
distinguishable, but ℰ(0) and ℰ(1) overlap. A smaller
code C_2 = {0, 2} is preserved, even though neither 0 nor
2 is a fixed point. The code C'_2 = {1, 3} is also preserved,
but the union of C_2 and C'_2 is not preserved. This
demonstrates that the set of preserved codes is not convex;
distinct preserved codes may rely on mutually contradictory
preconditions on S, e.g., "S was prepared in 0 or 2" and
"S was prepared in 1 or 3."
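
The claims of Example 6 can be spot-checked by brute force. The sketch below assumes the column-stochastic matrix described in the example (state k goes to k or k + 1 mod 4 with probability 1/2 each) and tests the weighted 1-norm condition on a finite grid of mixtures and priors, so a `True` result is evidence rather than proof. The names `apply_map`, `mixtures`, and `is_isometric_on_grid` are hypothetical helpers.

```python
# Illustrative sketch: grid check of the Weak Condition for Example 6.
from itertools import product

# T[i][j] = probability of output i given input j (assumed convention).
T = [[0.5, 0.0, 0.0, 0.5],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 0.5, 0.5]]

def apply_map(T, x):
    """Apply the stochastic matrix T to the probability vector x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def weighted_distance(p, x, y):
    """Classical version of ||p*rho - (1-p)*sigma||_1."""
    return sum(abs(p * xi - (1 - p) * yi) for xi, yi in zip(x, y))

def mixtures(code, dim, levels=3):
    """Convex combinations of the code states, on a coarse grid."""
    for weights in product(range(levels), repeat=len(code)):
        total = sum(weights)
        if total == 0:
            continue
        x = [0.0] * dim
        for state, w in zip(code, weights):
            x[state] = w / total
        yield x

def is_isometric_on_grid(code, T, priors=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """True if every sampled weighted distance is unchanged by T."""
    states = list(mixtures(code, len(T)))
    for x in states:
        for y in states:
            for p in priors:
                before = weighted_distance(p, x, y)
                after = weighted_distance(p, apply_map(T, x), apply_map(T, y))
                if abs(before - after) > 1e-12:
                    return False
    return True

print(is_isometric_on_grid([0, 2], T))        # code C_2
print(is_isometric_on_grid([1, 3], T))        # code C'_2
print(is_isometric_on_grid([0, 1, 2, 3], T))  # their union, C_4
```

The two-state codes pass the check while their union fails it, matching the statement that preserved codes can rest on mutually contradictory preconditions.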



Example 7. Why must distinguishability be preserved,
not just between code states, but between convex
combinations of them?

Let ℰ be a classical stochastic map on three states
{0, 1, 2}, which fixes states 0 and 1, but maps 2 → 1.
This map "squashes" the classical 3-simplex onto one of
its sides. Now, consider a code C comprising the states
on the thick (red) line in the figure:

[Figure: the classical 3-simplex squashed onto one of its sides, with the code C drawn as a thick (red) line.]

This code is not preserved by ℰ, because the original code
has structure that is missing in its image: States not on
the line between "0" and "1" can be unambiguously
discriminated (with p > 0) from states lying on the line.
However, there is no way to recover this structure by
applying another linear map afterward! Still, if we ignore
convex combinations, then all the 1-norm distances
‖pρ − (1 − p)σ‖_1, for ρ, σ ∈ C, are in fact preserved by ℰ.
This is because the best way to distinguish any two states
in C is to measure {0} vs. {1, 2}, and because the channel
maps 2 → 1, it does not actually affect this measurement.
If we consider convex combinations, however, we see that
C is not isometric to ℰ(C), resolving the problem.

Example 8. Why must all the weighted 1-norm distances
be preserved, rather than just ‖ρ − σ‖_1?

Let ℋ_S = ℂ³ be the state space of a qutrit. Define ℰ
to be the channel that does nothing to the {|0⟩, |1⟩}
subspace, but maps

\[ |2\rangle\langle 2| \to \tfrac{1}{2}\left( |0\rangle\langle 0| + |1\rangle\langle 1| \right). \]

Now, consider a code C comprising all the states of the form

\[ \tfrac{1}{2}\left( \rho_{\mathrm{Span}(|0\rangle, |1\rangle)} + |2\rangle\langle 2| \right). \]

We can think of this code as the set of states that would be
prepared by a machine that is supposed to produce qubit
states in the {|0⟩, |1⟩} subspace, but fails 50% of the time
and produces |2⟩⟨2| instead.

As in Example 7, this code is not preserved by ℰ.
In this case, the problem is that Alice can check to see
whether the preparation failed or not, but Bob cannot. As
before, this intuition is borne out by the fact that no
recovery operation exists. However, if we compute the
unweighted 1-norm distances ‖ρ − σ‖_1, both before and after
ℰ is applied, then we find that they are unchanged. Only
when we require preservation of the weighted 1-norm
distances (corresponding to distinguishing states with the aid
of prior information), do we correctly derive that C is not
preserved.
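
The difference between weighted and unweighted distances in Example 8 shows up already for diagonal states, where the 1-norm is just a sum of absolute values. The sketch below is a hedged illustration: it assumes the channel sends |2⟩⟨2| to the maximally mixed state on Span(|0⟩, |1⟩) (a reconstruction of the garbled formula, not a confirmed definition) and represents each diagonal density matrix by its list of diagonal entries. All function names are illustrative.

```python
# Illustrative sketch: weighted vs. unweighted 1-norm distances in Example 8,
# restricted to diagonal qutrit states (lists of three diagonal entries).

def channel(diag):
    """ASSUMED action: leave |0>,|1> alone, spread |2><2| evenly over them."""
    d0, d1, d2 = diag
    return [d0 + d2 / 2, d1 + d2 / 2, 0.0]

def weighted_distance(p, x, y):
    """||p*rho - (1-p)*sigma||_1 for diagonal rho, sigma."""
    return sum(abs(p * xi - (1 - p) * yi) for xi, yi in zip(x, y))

def unweighted_distance(x, y):
    """||rho - sigma||_1 for diagonal rho, sigma."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

# Two code states: (|0><0| + |2><2|)/2 and (|1><1| + |2><2|)/2.
rho = [0.5, 0.0, 0.5]
sigma = [0.0, 0.5, 0.5]

p = 0.6
print(unweighted_distance(rho, sigma),
      unweighted_distance(channel(rho), channel(sigma)))
print(weighted_distance(p, rho, sigma),
      weighted_distance(p, channel(rho), channel(sigma)))
```

Under this assumption the unweighted distance is the same before and after the channel, while the weighted distance at p = 0.6 shrinks, which is the signature that the code fails the Weak Condition.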

As Example 7 demonstrates, it is important that ℰ preserves
distinguishability not just between states in C, but
between convex combinations of them. This means that
we can (without loss of generality) extend C to include
all states in its convex closure. From now on, we will
simply assume that any preserved code is convex in this
sense, in line with [19]. The Weak Condition then has
a simple geometric interpretation. ℰ must preserve the
1-norm distance between any two unnormalized states pρ
and (1 − p)σ. This means that the entire convex cone of
C - that is, the set C_+ containing xρ for all x ≥ 0 and
ρ ∈ C - must be isometric to its image ℰ(C_+). Two sets
are isometric if there is a distance-preserving mapping
(an isometry) between them. Here, the relevant metric
is the 1-norm distance

\[ D(A, B) = \|A - B\|_1, \]

and ℰ is the isometry that preserves it. Thus, preservation
requires that the convex cone C_+ evolves rigidly,
with respect to the 1-norm distance, under ℰ.

Our necessary and sufficient conditions bracket the as- 
yet-vague notion of a code being preserved by a chan- 
nel. Fixedness seems too strong, isometry perhaps too 
weak. One of our main goals in this paper is to derive a 
single, rigorously stated condition for information to be 
"preserved" by a channel. We will eventually do so by 
squeezing the Strong and Weak Conditions together as 
follows: 

Proposition 1. If C is a maximum isometric code for
ℰ (i.e., it satisfies the Weak Condition, and there is no
larger C that satisfies the Weak Condition), then there
exists a CPTP map ℛ such that ℛ ∘ ℰ(ρ) = ρ for all
states ρ ∈ C.

By proving this proposition, we will demonstrate that
the Strong and Weak Conditions for preservation are
equivalent, given the ability to apply a recovery
operation. The proof is somewhat involved. In the next
section, we will derive a structure theorem for preserved
codes, explore its consequences, and finally derive
Proposition 1 as a corollary (Corollary 7 of Lemma 6, which
follows from Theorem 1). Anticipating this sequence of
derivations, we proffer the following definition of
"preserved" now, with the understanding that it will only be
justified by what follows:

Definition 7. A code $\mathcal{C}$ is preserved by a CPTP map $\mathcal{E}$ if and
only if it satisfies the Weak Condition, that is,

$$\|\mathcal{E}(p\rho - (1-p)\sigma)\|_1 = \|p\rho - (1-p)\sigma\|_1$$

for all $\rho, \sigma \in \mathcal{C}$ and $p \in [0, 1]$.
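As a rough numerical illustration of Definition 7, the sketch below spot-checks the Weak Condition for a single qubit. The completely dephasing channel, the two candidate codes, and all function names are our own illustrative assumptions, not constructions taken from the text; the check runs over a finite grid of weights $p$ and so is a sanity test rather than a proof.

```python
import math

def dephase(m):
    """Completely dephasing qubit channel (an assumed example): keep the diagonal only."""
    return [[m[0][0], 0.0], [0.0, m[1][1]]]

def trace_norm(m):
    """1-norm of a real symmetric 2x2 matrix: sum of the absolute eigenvalues."""
    half_tr = (m[0][0] + m[1][1]) / 2
    # Clamp at zero so rounding can never feed sqrt a tiny negative number.
    disc = math.sqrt(max(0.0, ((m[0][0] - m[1][1]) / 2) ** 2 + m[0][1] * m[1][0]))
    return abs(half_tr + disc) + abs(half_tr - disc)

def weighted_diff(p, rho, sigma):
    """The operator p*rho - (1-p)*sigma appearing in the Weak Condition."""
    return [[p * rho[i][j] - (1 - p) * sigma[i][j] for j in range(2)] for i in range(2)]

def satisfies_weak_condition(channel, states, steps=10, tol=1e-12):
    """Check ||E(p rho - (1-p) sigma)||_1 == ||p rho - (1-p) sigma||_1 on a grid of p."""
    for rho in states:
        for sigma in states:
            for k in range(steps + 1):
                d = weighted_diff(k / steps, rho, sigma)
                if abs(trace_norm(channel(d)) - trace_norm(d)) > tol:
                    return False
    return True

zero = [[1.0, 0.0], [0.0, 0.0]]
one = [[0.0, 0.0], [0.0, 1.0]]
plus = [[0.5, 0.5], [0.5, 0.5]]
minus = [[0.5, -0.5], [-0.5, 0.5]]

classical_code_ok = satisfies_weak_condition(dephase, [zero, one])   # diagonal code
coherent_code_ok = satisfies_weak_condition(dephase, [plus, minus])  # off-diagonal code
```

Under these assumptions the diagonal code passes the check, while the code built from the two superposition states fails it, as expected for a channel that destroys coherences.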

III. THE STRUCTURE OF PRESERVED 
INFORMATION 

In Section II, we stated plausible necessary and sufficient
conditions for a code to be "preserved", and suggested a
formal definition of preservation (conditional
on some technical results to be proved in what follows).
Next, we shall build upon this foundation, elucidating 
the structures that follow from it. First, we will prove a 
series of theorems about preserved codes, culminating in 
a structure theorem showing that preserved codes have 
the same "shape" as matrix algebras. This indicates that 
preserved codes are related to algebras, but provides no 
real context for how they are related, nor what role the
algebra is playing. So, our second task is to analyze the
underlying IPS.

Except where explicitly noted, all the proofs of theorems
and lemmas in this section have been deferred to
Appendix B.

A. The shape of a preserved code 

Suppose that $\mathcal{C}$ is a preserved code for $\mathcal{E}$. Starting
from Definition 7, what can we derive about $\mathcal{C}$? Quite a lot,
as it turns out. The following two definitions from Ref.
[19] will be needed.

Definition 8. A code $\mathcal{C}$ is noiseless for a CPTP map
$\mathcal{E}$ if and only if it is preserved by any convex combination
$\sum_n q_n \mathcal{E}^n$, with $q_n \geq 0$ and $\sum_n q_n = 1$.

Noiselessness is stricter than preservation (every noise- 
less code is preserved, but most preserved codes are not 
noiseless), but weaker than fixedness (every fixed code is 
noiseless, but some noiseless codes are not fixed). Noise- 
less codes are special because their states remain distin- 
guishable no matter how many times £ is applied (note 
that only channels whose output space is the same as 
their input space can have noiseless codes). This captures
the operational significance of fixedness, and as
we will show below (Lemma 2), there is a close mathematical
connection between noiseless and fixed codes.

Definition 9. A code $\mathcal{C}$ is correctable for $\mathcal{E}$ if and only
if there exists a CPTP map $\mathcal{R}$ such that $\mathcal{C}$ is noiseless for
$\mathcal{R} \circ \mathcal{E}$.

Correctable codes can be made noiseless, by applying 
a suitable correction operation every time £ happens. 
Readers familiar with QEC may worry that our definition
is slightly different from the usual one, which requires
that $\mathcal{C}$ be fixed by $\mathcal{R} \circ \mathcal{E}$, rather than just noiseless. It will
turn out that our (apparently weaker) condition implies
the usual one, so we obtain the same result with a weaker
assumption (see footnote 6). We are now in a position to state a key
theorem:

Theorem 1. A [convex] code $\mathcal{C}$ is correctable for $\mathcal{E}$ if
and only if it is preserved by $\mathcal{E}$.

Although the full proof is rather technical (see
Appendix B), one aspect is especially useful and
interesting. We prove the theorem by explicitly constructing a
correction operation for an arbitrary code $\mathcal{C}$. Moreover,
the correction operation is independent of $\mathcal{C}$'s structure,


6 In the terminology of Ref. [11], a code $\mathcal{C}$ which is fixed by
$\mathcal{R} \circ \mathcal{E}$ is referred to as "completely correctable". That
complete correctability is in fact equivalent to correctability
can be alternatively established by exploiting the explicit
form of 1-isometric encodings, see Thm. 4 therein.
and depends only on $\mathcal{C}$'s support. A code's support is the
subspace $\mathcal{P} \subseteq \mathcal{H}$, comprising the union of the supports
of all $\rho \in \mathcal{C}$. Since the correction only depends on the
code's support, every code with the same support will be
corrected by the same operation. Remarkably, this operation
coincides with the transpose channel introduced in
Ref. [35], defined as

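$$\hat{\mathcal{E}}_{\mathcal{P}} = \Pi \circ \mathcal{E}^\dagger \circ \mathcal{N}, \qquad (6)$$

where $P$ is the projector onto $\mathcal{P}$, $\Pi(\cdot) = P \cdot P$ is the
projection onto $\mathcal{P}$, $\mathcal{E}^\dagger$ is the adjoint map of $\mathcal{E}$, and $\mathcal{N}$ is
a normalization map $\mathcal{N}(\cdot) = \mathcal{E}(P)^{-1/2}(\cdot)\mathcal{E}(P)^{-1/2}$.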

This theorem has two consequences. First, it strongly
suggests that Definition 7 captures the critical notions
of information preservation. Second, it implies a simple
corollary: Every preserved code for $\mathcal{E}$ is noiseless for some
other map $\mathcal{R} \circ \mathcal{E}$. This connection from preserved to
noiseless codes is a step toward proving Proposition 1.
Even more importantly, it will let us derive a structure
theorem for preserved codes. To do so, we need another
result.

Lemma 2. Every noiseless code $\mathcal{C}$ for $\mathcal{E}$ is isometric to
a set of states that are fixed points of $\mathcal{E}$.

This means that noiseless and fixed codes are geomet- 
rically almost the same. A noiseless code does not have 
to be precisely fixed, but it will always be isometric to 
a fixed code - that is, it will have the same shape. A 
simple example may be in order. 

Example 9. Let $\mathcal{E}$ be a channel on two qubits, labeled A
and B, that does nothing to A but depolarizes B:

$$\mathcal{E}(\rho_{AB}) = \mathrm{Tr}_B(\rho_{AB}) \otimes \frac{\mathbb{1}_B}{2}.$$

Qubit A clearly is a NS under $\mathcal{E}$, whose fixed states are
of the form $\mathcal{C}_{NS} = \rho_A \otimes (\mathbb{1}/2)_B$. However, there are other
noiseless codes. For instance, let $\mathcal{C}$ comprise all states of
the form $\rho_A \otimes |0\rangle\langle 0|_B$. Qubit B carries no information, so
$\mathcal{E}$'s action on it is irrelevant. None of $\mathcal{C}$'s
distinguishability properties are affected by $\mathcal{E}$, even though $\mathcal{C}$ is not
actually fixed. Note, however, that $\mathcal{C}$'s image $\mathcal{E}(\mathcal{C})$ is a
fixed code. Repeated applications of $\mathcal{E}$ map its noiseless
codes to fixed codes.
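The behaviour described in Example 9 can be checked directly with a few lines of Python. This is only a sketch: the helper names and the particular state chosen for qubit A are our own assumptions, and two-qubit matrices are indexed as 2*a + b.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def trace_out_b(rho_ab):
    """Partial trace over qubit B of a two-qubit matrix (index = 2*a + b)."""
    return [[rho_ab[2 * i][2 * j] + rho_ab[2 * i + 1][2 * j + 1] for j in range(2)]
            for i in range(2)]

def depolarize_b(rho_ab):
    """The channel of Example 9: Tr_B(rho_AB) tensored with the maximally mixed state."""
    return kron(trace_out_b(rho_ab), [[0.5, 0.0], [0.0, 0.5]])

rho_a = [[0.75, 0.25], [0.25, 0.25]]  # an arbitrary valid qubit state (assumed)
ket0_b = [[1.0, 0.0], [0.0, 0.0]]     # |0><0| on qubit B

code_state = kron(rho_a, ket0_b)      # a state of the noiseless code C
once = depolarize_b(code_state)
twice = depolarize_b(once)

is_fixed = (once == code_state)           # C itself is not fixed
image_is_fixed = (twice == once)          # but its image under the channel is
a_untouched = (trace_out_b(once) == rho_a)  # the state of qubit A is unchanged
```

The entries used here are exactly representable in binary floating point, so the equality comparisons are exact for this particular choice of state.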

Lemma 2 implies that a channel has a unique maximum
(largest) noiseless code, and that the latter must
be isometric to the set of all fixed states:

Corollary 3. Every maximum noiseless code for a
channel $\mathcal{E}$ is isometric to the full fixed-point set of $\mathcal{E}$.

A channel can have smaller noiseless codes - even max- 
imal ones. Consider the following example: 

Example 10. Let $\mathcal{E}$ be a channel on two qubits, labeled
A and B, acting as follows: It measures B in the
$\{|0\rangle, |1\rangle\}$ basis; conditional on $|0\rangle\langle 0|$ it does nothing;
conditional on $|1\rangle\langle 1|$, it dephases A and flips B to the $|0\rangle$
state. Every state of the form $\rho_A \otimes |0\rangle\langle 0|_B$ is a fixed
point, and so the largest noiseless code encodes a single
qubit in A, as in Example 9. However, there is another
maximal noiseless code comprising all states of the form
$(p|0\rangle\langle 0| + (1-p)|1\rangle\langle 1|)_A \otimes |1\rangle\langle 1|_B$. It is isometric to a
strict subset of the fixed points, so it is not a maximum
code.

Recall that any preserved code can be made noiseless
by applying a suitable recovery map (Theorem 1). By
combining this theorem with the corollary to Lemma 2, we
establish a direct connection between arbitrary preserved
codes and fixed states of CPTP maps.

Theorem 4. Every maximum preserved code for a
CPTP map $\mathcal{E}$ is 1-isometric to the full set of fixed states
for some other CPTP map $\mathcal{R} \circ \mathcal{E}$.

Proof. This follows from combining Lemma 2 with
Theorem 1 and Definition 9. ■

This points the way to the structure theorem we are
looking for, provided that we can say something about
the fixed points of the unknown CPTP map $\mathcal{R} \circ \mathcal{E}$. Quite
a bit is known about fixed points of CPTP maps. In
particular, if $\mathcal{H}$ is finite-dimensional, and the map is unital
(meaning that it preserves the identity operator), then
its fixed points form a matrix algebra [12, 13].

A matrix algebra (a.k.a. finite-dimensional C*-algebra)
is a vector space of complex matrices, closed
under multiplication and Hermitian conjugation. It
follows that

1. The matrices must be square (otherwise they cannot
be multiplied);

2. The set of all $d \times d$ complex matrices (i.e., operators
on a d-dimensional Hilbert space $\mathcal{H}$) is an algebra,
denoted $\mathcal{M}_d$ or $\mathcal{M}_{\mathcal{H}}$ henceforth;

3. The set containing only the $d \times d$ identity matrix
is an algebra, denoted $\mathbb{1}_d$ or $\mathbb{1}_{\mathcal{H}}$.

Happily, these three simple facts are sufficient to describe
any matrix algebra. The structure theorem [36] for matrix
algebras states that any such matrix algebra $\mathcal{A}$ is
unitarily equivalent to the canonical form:

$$\mathcal{A} \cong \bigoplus_k \mathcal{M}_{A_k} \otimes \mathbb{1}_{B_k}, \qquad (7)$$

where $A_k$ and $B_k$ are complex vector spaces of dimension
$d_k$ and $n_k$, respectively. We will refer to each of the
subspaces $A_k \otimes B_k$ in the direct sum labeled by $k$ as a
"k-sector". Each k-sector factors into a noiseless subsystem
(with Hilbert space $A_k$) and a noise-full subsystem (with
Hilbert space $B_k$) (see footnote 7). Thus, every matrix algebra is built



7 Note that in the original definition of [3], a decomposition of the
form given in Eq. (7) is applied to the (associative) error algebra
as opposed to states, whereby the identification of the noiseless
factors with $B_k$.
up out of the two simple components described in points
2. and 3. above (the algebra of all $d \times d$ matrices, and
the trivial algebra).
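To make the bookkeeping of the canonical form in Eq. (7) concrete, the following Python sketch computes two dimensions from a list of k-sectors. The representation of a sector as a pair (d_k, n_k) and the function names are our own illustrative choices.

```python
def hilbert_dimension(sectors):
    """Dimension of the direct sum of A_k (x) B_k, for sectors given as (d_k, n_k) pairs."""
    return sum(d * n for d, n in sectors)

def algebra_dimension(sectors):
    """Complex dimension of the algebra: each factor M_{A_k} contributes d_k squared."""
    return sum(d * d for d, _ in sectors)

# One sector with a qubit noiseless factor and a qubit noise-full factor.
noiseless_subsystem = [(2, 2)]
# Two sectors: a qubit with trivial co-factor, plus a trivial factor with a qubit co-factor.
hybrid = [(2, 1), (1, 2)]

ns_dims = (hilbert_dimension(noiseless_subsystem), algebra_dimension(noiseless_subsystem))
hybrid_dims = (hilbert_dimension(hybrid), algebra_dimension(hybrid))
```

Both examples live in a four-dimensional Hilbert space, but the algebras differ: the noise-full factors enlarge the Hilbert space without adding to the algebra.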

As remarked earlier, the fixed points of a unital map 
form an algebra. Prior to this work (and the results an- 
ticipated in [H]), no such result was known for arbitrary 
non-unital maps. Before stating our main structure the- 
orem, we need to define a couple of terms. 

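Definition 10. Consider a matrix algebra
$\mathcal{A} = \bigoplus_k \mathcal{M}_{A_k} \otimes \mathbb{1}_{B_k}$, which induces a Hilbert space
decomposition $\mathcal{H} = \bigoplus_k A_k \otimes B_k$. A distortion map for $\mathcal{A}$ is
a CPTP map $\mathcal{D}$ such that, for every $X = \sum_k M_{A_k} \otimes \mathbb{1}_{B_k}$
in $\mathcal{A}$,

$$\mathcal{D}(X) = \sum_k M_{A_k} \otimes \tau_k,$$

where $\tau_k$ is a positive semidefinite matrix on $B_k$ that does
not depend on $M_{A_k}$. $\mathcal{D}(\mathcal{A})$ is a distortion of $\mathcal{A}$. A
vector space of matrices $\tilde{\mathcal{A}}$ is a distorted algebra if it
is a distortion of some matrix algebra $\mathcal{A}$.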

A distorted algebra is simply an algebra in which each
identity factor has been replaced with an arbitrary (but
fixed) matrix $\tau_k$. A distorted algebra is not an algebra
under standard matrix multiplication (because $\tau_k^2 \neq \tau_k$),
although it is under a suitably redefined matrix
multiplication. More importantly, there exist CP distortion maps
that reversibly transform $\mathcal{A} \leftrightarrow \tilde{\mathcal{A}}$, simply by changing the
$\tau_k$ factors. Thus, $\mathcal{A}$ and $\tilde{\mathcal{A}}$ are isometric.

We can now characterize the fixed points of an arbi- 
trary CPTP map and its adjoint (that is, fixed states and 
observables): 

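Theorem 5. Let $\mathcal{E}$ be a CPTP map on $\mathcal{B}(\mathcal{H})$, and $\mathcal{E}^\dagger$
its adjoint. Let $\mathrm{Fix}(\mathcal{E})$ be the fixed points of $\mathcal{E}$, and
$\mathrm{Fix}(\mathcal{E}^\dagger)$ the fixed points of $\mathcal{E}^\dagger$. Then,

(i) Let $\mathcal{P}_0 \subseteq \mathcal{H}$ be the support of $\mathrm{Fix}(\mathcal{E})$. Then $\mathcal{P}_0$ is
an invariant subspace under $\mathcal{E}$.

(ii) Let $\mathcal{E}_{\mathcal{P}_0}$ be the restriction of $\mathcal{E}$ to $\mathcal{P}_0$, so
$\mathcal{E}_{\mathcal{P}_0} = \Pi_0 \circ \mathcal{E} \circ \Pi_0$, where $\Pi_0$ projects onto $\mathcal{P}_0$. Then the fixed
points of $\mathcal{E}_{\mathcal{P}_0}$ form a matrix algebra $\mathcal{A}$.

(iii) $\mathrm{Fix}(\mathcal{E})$ is a distortion of $\mathcal{A}$.

(iv) $\mathrm{Fix}(\mathcal{E}^\dagger)$ is a 1:1 extension of $\mathcal{A}$ from $\mathcal{P}_0$ to $\mathcal{H}$. That
is, for each $X \in \mathcal{A}$, there exists precisely one $X' \in \mathrm{Fix}(\mathcal{E}^\dagger)$
so that $X = \Pi_0(X') = P_0 X' P_0$.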

While Theorem 5 is somewhat intimidating (we shall
use all of its pieces in Section V), the payoff for its
complexity is that it consistently unifies the Schrödinger and
Heisenberg pictures of information preservation (see also
Refs. [3, 4, 7]). The Schrödinger approach involves looking
at the fixed states in $\mathrm{Fix}(\mathcal{E})$. The Heisenberg approach,
on the other hand, emphasizes observables of the system,
which evolve according to $\mathcal{E}^\dagger$ (since expectation values
evolve as $\mathrm{Tr}\{X \mathcal{E}(\rho)\} = \mathrm{Tr}\{\mathcal{E}^\dagger(X)\rho\}$). Fixed states of
$\mathcal{E}$ in the Schrödinger picture translate to fixed
observables of $\mathcal{E}^\dagger$ in the Heisenberg picture. Theorem 5 shows
that both such fixed sets are isometric to the same matrix
algebra $\mathcal{A}$. This algebra determines the structure of
preserved codes, so the two pictures (interpreted correctly)
yield equivalent characterizations of preserved
information.
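The duality between the two pictures, $\mathrm{Tr}\{X \mathcal{E}(\rho)\} = \mathrm{Tr}\{\mathcal{E}^\dagger(X)\rho\}$, is easy to verify numerically for a channel given in Kraus form. The sketch below uses a qubit amplitude-damping channel purely as an assumed example; the channel, the damping value, the test state, and the helper names are not taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate of a square matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def apply_channel(kraus, rho):
    """Schrodinger picture: E(rho) = sum_i K_i rho K_i^dagger."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        out = add(out, matmul(matmul(k, rho), dagger(k)))
    return out

def apply_adjoint(kraus, x):
    """Heisenberg picture: E^dagger(X) = sum_i K_i^dagger X K_i."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        out = add(out, matmul(matmul(dagger(k), x), k))
    return out

gamma = 0.3  # damping probability, an arbitrary illustrative value
kraus = [
    [[1 + 0j, 0j], [0j, complex(math.sqrt(1 - gamma))]],
    [[0j, complex(math.sqrt(gamma))], [0j, 0j]],
]
rho = [[0.6 + 0j, 0.2 - 0.1j], [0.2 + 0.1j, 0.4 + 0j]]  # an assumed valid qubit state
pauli_x = [[0j, 1 + 0j], [1 + 0j, 0j]]
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

schrodinger = trace(matmul(pauli_x, apply_channel(kraus, rho)))
heisenberg = trace(matmul(apply_adjoint(kraus, pauli_x), rho))
identity_image = apply_adjoint(kraus, identity)  # the adjoint of a CPTP map is unital
```

The last line illustrates why the identity is always a fixed observable of the adjoint map, even when the channel itself is not unital.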

Some of the results in Theorem 5 were proved previously,
in different (though related) contexts. Our
characterization of $\mathrm{Fix}(\mathcal{E}^\dagger)$ [parts (ii) and (iv)] follows, in
particular, from a classic operator algebra paper by Choi
and Effros [37]. Their results are substantially more
abstract and less constructive, but Kuperberg subsequently
applied them to quantum information (see Ref. [38],
Theorems 2.2 and 2.3). The proofs given here are
self-contained (and perhaps more accessible to physicists).

The fact that an arbitrary CPTP map's fixed points
are isometric to a matrix algebra, together with Theorem
4, nails down the structure of every preserved code. If $\mathcal{C}$ is
a preserved code for a channel $\mathcal{E}$, then it is isometric (i.e.,
rigidly equivalent) to a matrix algebra. Furthermore, $\mathcal{E}$'s
fixed points are a subspace of matrices that looks very
much like an algebra, except that each of the identity
factors $\mathbb{1}_{B_k}$ has been replaced by some fixed matrix $\tau_k$.

While the domain of $\mathcal{E}$ contains all operators on $\mathcal{H}$,
its physical significance comes from its action on positive
semidefinite states. Given any algebra $\mathcal{A}$ in the
canonical form of Eq. (7), we can easily identify the set
$\mathcal{A}_+$ of positive states in $\mathcal{A}$: $\mathcal{A}_+$ contains states of the
form $\sum_k p_k \rho_k \otimes \frac{\mathbb{1}_{B_k}}{n_k}$, where the $\{p_k\}$ form a
probability distribution, and the $\{\rho_k\}$ are arbitrary states on the
noiseless factors.

$\mathcal{E}$'s fixed states ($\mathrm{Fix}(\mathcal{E})_+$) form a very similar set,
comprising states of the form $\sum_k p_k \rho_k \otimes \tau_k$, where the $\{p_k\}$
and $\{\rho_k\}$ are probabilities and arbitrary states as above,
and the $\tau_k$ are fixed density matrices determined by $\mathcal{E}$.
Any set of fixed states is a fixed code for $\mathcal{E}$, and $\mathrm{Fix}(\mathcal{E})_+$
is the unique largest fixed code. Lemma 2 implies a
relationship between noiseless and fixed codes, from which
it follows that:

Lemma 6. Let $\mathcal{E}: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ be a CP map with a
full-rank fixed point, whose fixed points induce (see
Theorem 5) the decomposition

$$\mathcal{H} = \bigoplus_k (A_k \otimes B_k).$$

Then $\mathcal{C}$ is a [convex] maximum noiseless code for $\mathcal{E}$ if
and only if $\mathcal{C}$ comprises all states of the following form

$$\rho = \sum_k p_k \rho_{A_k} \otimes \tau_k, \qquad (8)$$

where the $\rho_{A_k}$ are arbitrary states on $A_k$ and each $\tau_k$ is
a fixed (i.e., the same for all $\rho$) state on $B_k$.

Note that the lemma is only proved for channels with
a full-rank fixed point. We believe that a similar result
can be proved for arbitrary channels, but there are some
tricky details that obscure the main point. We only need
to apply this result to channels of the form $\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$, with
$\hat{\mathcal{E}}_{\mathcal{P}}$ defined in Eq. (6). Each such channel, from $\mathcal{B}(\mathcal{P}) \to \mathcal{B}(\mathcal{P})$,
is actually unital (since $\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}(P) = P$), so it
has a full-rank fixed point, and Lemma 6 is sufficient to
characterize its noiseless codes: They are isometric to the
channel's fixed points, and those have algebraic structure.
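The unitality claim used above can be spelled out in a few lines. The derivation below is a sketch under one stated assumption: that the inverse square root in the normalization map of Eq. (6) is taken on the support of $\mathcal{E}(P)$, whose projector we call $Q$ (a symbol introduced here only for this purpose). The transpose channel is written as $\hat{\mathcal{E}}_{\mathcal{P}} = \Pi \circ \mathcal{E}^\dagger \circ \mathcal{N}$, with $\Pi(\cdot) = P \cdot P$ and $\mathcal{N}(\cdot) = \mathcal{E}(P)^{-1/2}(\cdot)\mathcal{E}(P)^{-1/2}$.

```latex
% Assumption: \mathcal{E}(P)^{-1/2} is the inverse square root on the support of
% \mathcal{E}(P), and Q denotes the projector onto that support.
\begin{align*}
\mathcal{N}\bigl(\mathcal{E}(P)\bigr)
  &= \mathcal{E}(P)^{-1/2}\,\mathcal{E}(P)\,\mathcal{E}(P)^{-1/2} = Q, \\
\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}(P)
  &= P\,\mathcal{E}^{\dagger}(Q)\,P .
\end{align*}
% For any state \rho supported on \mathcal{P}, we have \rho \le \lambda P for some
% \lambda > 0, so \mathcal{E}(\rho) is supported inside Q. Hence
\begin{equation*}
\mathrm{Tr}\bigl[\rho\,\mathcal{E}^{\dagger}(Q)\bigr]
  = \mathrm{Tr}\bigl[\mathcal{E}(\rho)\,Q\bigr]
  = \mathrm{Tr}\bigl[\mathcal{E}(\rho)\bigr]
  = 1
  = \mathrm{Tr}[\rho\,P] .
\end{equation*}
% Since this holds for every state \rho on \mathcal{P}, and such states span the
% operators on \mathcal{P}, it follows that P \mathcal{E}^{\dagger}(Q) P = P.
```

The identity operator on $\mathcal{P}$ is therefore a full-rank fixed point of the composed channel, which is what Lemma 6 requires.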

So while a channel $\mathcal{E}$ typically has a lot of noiseless
codes, they turn out to be trivial variations on a constant
theme. The variation is a gauge: a particular
state $\mu_{B_k}$ for each of the noise-full subsystems. The actual
information is carried by the variation in the code
states, which differ only on the noiseless factors $A_k$, and
in the weights $p_k$ assigned to the different k-sectors. This
suggests an obvious way to turn noiseless codes into fixed
codes, simply by adjusting the state of the noise-full
subsystems. Thus, we can finally justify Proposition 1 with
the following corollary to Lemma 6:

Corollary 7. For every maximum preserved code $\mathcal{C}$,
there exists a CPTP map $\mathcal{R}$ such that $\mathcal{R} \circ \mathcal{E}(\rho) = \rho$ for
all states $\rho \in \mathcal{C}$.

We have finally proved the central proposition of the 
previous section, justifying our definition of "preserved". 
If and only if a code satisfies Definition 7, there exists a
recovery operation that makes it into a fixed code, which 
is clearly preserved in the strongest possible sense. How- 
ever, this depends on Bob's ability to apply the neces- 
sary recovery immediately after £ happens! Section IV 
considers the effect of placing operational restrictions on 
what Bob can do, and how this can change the criteria 
for preservation. 

We note in passing that the framework presented by 
Kuperberg in [38] is similar and uses much of the same 
mathematics. However, it only addressed noiseless and 
unitarily noiseless information (a.k.a. infinite-distance
codes), not correctable information, or the relationship 
between preservation and correctability. 



B. IPSs: The structures that underlie preserved
codes



Taken together, the results we have presented thus far 
indicate a rigid algebraic structure lurking within each 
CPTP map £ , which constrains the shape of its preserved 
and noiseless codes. The codes themselves are not the 
structure, however. There are many noiseless codes, all 
distortions of the same algebra. What matters is their 
shared structure. In fact, all these noiseless codes are 
manifestations of a unique noiseless IPS underlying the 
channel, which we turn to explore next. We begin with 
an example. 

Example 11. Consider the two-qubit channel of Example 9,
which depolarizes qubit B. There is an infinite
family of maximum noiseless codes for this channel: If
$\tau_B$ is a valid state for B, then $\mathcal{C}_\tau = \{\rho_A \otimes \tau_B \ \forall \rho_A\}$
is a noiseless code. While distinct, these noiseless codes
are all equivalent, and share the same recovery operation,
$\mathcal{R} = \mathbb{1}$. Thus, they are all manifestations of the same
noiseless IPS.

This example demonstrates a noiseless IPS, but a channel
can also have correctable codes that are not noiseless.
However, these codes are noiseless for the appropriate
$\mathcal{R} \circ \mathcal{E}$, so the preserved codes with a common recovery
$\mathcal{R}$ also share a common structure. A channel can have
multiple preserved IPSs. In a way, each IPS is akin to 
a hole in the wall of noise, through which information 
can (if properly aimed) pass unscathed. The preserved 
codes reflect this structure, but their diversity can also 
obscure it. If we can concisely describe a channel's IPSs, 
we have (for all practical purposes) completely classified 
its preserved codes. 

Let us define "information-preserving structure" more 
precisely. Every maximum preserved code is isometric 
to an algebra, and preserved codes isometric to the same 
algebra are essentially trivial variations on a theme. They 
are manifestations of the same underlying IPS. 

Definition 11. An information-preserving structure
for a CPTP map $\mathcal{E}$ is an equivalence class of maximum
preserved codes for $\mathcal{E}$. Two codes are equivalent
if they are isometric to the same algebra, and are preserved
according to the same operational criterion (e.g.,
Definition 7, Definition 8, or one of the other operational
criteria in Section IV) with the same recovery operation.



The IPS is not itself an algebra. Rather, an IPS is an 
abstract structure (an equivalence class of codes), whose 
properties are defined by an associated algebra. It is 
possible for a channel to have two distinct IPS with the 
same (isomorphic) algebra. 

By looking at the structure theorem for matrix algebras
(Eq. (7)), we can interpret any given IPS. It consists
of one or more k-sectors, each of which contains a noiseless
subsystem supported on $A_k$ and a noise-full subsystem
supported on $B_k$. Any information encoded into the
$A_k$ factors will be preserved by $\mathcal{E}$, whereas any information
encoded into the $B_k$ factors is irreparably damaged.
The information-carrying capability of a code is determined
entirely by its underlying IPS; distinct codes that
share an IPS are equivalent, carrying the same kind and
amount of information.

Example 12. Consider a classical stochastic map on
four symbols, {0,1,2,3}, which maps each input symbol
to a mixture of output symbols as follows:

$$0 \to \{0,1\}, \quad 1 \to \{2,3\}, \quad 2 \to \{0,2\}, \quad 3 \to \{1,3\}.$$

There are exactly two maximal preserved codes for this
channel, both of which are actually noiseless: {0, 1} and
{2,3}. They are equivalent, and both described by the
same (commutative) algebra, but this is merely a coincidence.
The two codes occupy disjoint subspaces of the
input, they both get mapped to output states which span
the entire output space in different ways, they have
entirely different recovery maps, and by changing the channel
slightly, we can easily eliminate either code without
affecting the other. They are thus not manifestations of
the same IPS.
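For a classical map like the one in Example 12, a set of input symbols can be perfectly distinguished at the output exactly when their output supports are pairwise disjoint. The brute-force Python sketch below enumerates the maximal such sets for the map 0 to {0,1}, 1 to {2,3}, 2 to {0,2}, 3 to {1,3}. The function names are our own, and only the perfect-distinguishability property after a single use of the channel is tested here.

```python
from itertools import combinations

# Output support of each input symbol, as read from Example 12.
channel = {0: {0, 1}, 1: {2, 3}, 2: {0, 2}, 3: {1, 3}}

def is_zero_error_code(code, channel):
    """True if the output supports of all codewords are pairwise disjoint."""
    return all(channel[a].isdisjoint(channel[b]) for a, b in combinations(code, 2))

def maximal_codes(channel):
    """Brute force: all zero-error codes that cannot be extended by another symbol."""
    symbols = sorted(channel)
    codes = [set(c)
             for r in range(1, len(symbols) + 1)
             for c in combinations(symbols, r)
             if is_zero_error_code(c, channel)]
    # Keep only codes that are not a strict subset of another valid code.
    return [c for c in codes if not any(c < other for other in codes)]

found = maximal_codes(channel)
```

The search visits every subset of input symbols, so its cost grows exponentially with the alphabet size; it is meant as an illustration on four symbols, not as a practical algorithm.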

To make use of an IPS, Alice and Bob use any of the
equivalent codes associated with that IPS. Each of these
codes is isometric to the IPS's algebra, so the structure
of that algebra tells us everything about its
information-carrying capability. Since the algebra can be decomposed
according to Eq. (7),

$$\mathcal{A} \cong \bigoplus_k \mathcal{M}_{A_k} \otimes \mathbb{1}_{B_k},$$

we can represent it concisely by its shape: the vector
$\{d_1, d_2, \ldots, d_n\}$ listing the dimensions of the
information-carrying factors $\mathcal{H}_{A_k}$ (the noise-full factors are
irrelevant). Pictorially:



The IPS shape characterizes the type and amount of
information an IPS can carry. A k-sector with an $\mathcal{H}_{A_k}$ factor
of dimension $d_k > 1$ can carry quantum information.
Classical information is carried by the choice between the
different k-sectors. Kuperberg, in Ref. [38], described such
a noiseless IPS as a hybrid quantum memory, capable of
simultaneously storing or transmitting a certain amount
of quantum information and a certain amount of classical
information. The IPS shape provides a very concise
way of describing the noise-free degrees of freedom within
a given system's Hilbert space, much more convenient
than listing the $d^4$ real parameters required to specify a
quantum process on a d-dimensional Hilbert space!

From a physical standpoint, algebraic structure imposes
a very strong constraint on the types of information
that a quantum process can preserve. A priori, we might
suppose that any subspace of $\mathcal{B}(\mathcal{H})$ could be
"superselected" by some process; however, the theorems proved
above rule out most such possibilities.

Example 13. Consider a single qubit, with $\mathcal{H} = \mathbb{C}^2$. Its
dynamics will be described by some CPTP map (or family
of them). These dynamics destroy some information
while preserving other information, a.k.a. dynamical
superselection. Although there are infinitely many different
kinds of dynamics, there are only three possible IPSs.
The dynamics can preserve the full qubit algebra $\mathcal{M}_2$; or
a classical bit, represented (up to unitaries) by the
algebra span$\{\mathbb{1}, \sigma_z\}$; or nothing, represented by the trivial
algebra $\{\mathbb{1}\}$. In particular, there are no CP maps that
single out a rebit (a mythical physical system described
by a 2-dimensional real Hilbert space). This would correspond
to preserving information on some equatorial plane
of the Bloch sphere, spanned by $\sigma_x$ and $\sigma_y$, while
annihilating information about $\sigma_z$. But span$\{\sigma_x, \sigma_y\}$ is not a
closed algebra, for $\sigma_x$ and $\sigma_y$ generate the full qubit
algebra. The fact that no CPTP map can annihilate $\sigma_z$ while
preserving $\sigma_x$ and $\sigma_y$ is known, in quantum information
folklore, as the "No-Pancake Theorem".


Our central result might be thought of as a fully general
No-Pancake Theorem, since it rules out the dynamical
superselection of all such non-algebraic IPSs.
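The algebraic fact behind the No-Pancake Theorem, that the span of $\sigma_x$ and $\sigma_y$ is not closed under multiplication, can be confirmed in a few lines of Python. This is a sketch with hand-written 2x2 helpers; the variable names are our own.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

product = matmul(sigma_x, sigma_y)
i_sigma_z = [[1j * entry for entry in row] for row in sigma_z]

# The product of the two equatorial Pauli matrices is i times sigma_z.
product_is_i_sigma_z = (product == i_sigma_z)

# Every element a*sigma_x + b*sigma_y has a vanishing diagonal, so a matrix with
# a nonzero diagonal entry cannot lie in their span.
product_in_span = (product[0][0] == 0 and product[1][1] == 0)
```

Because the product leaves the span, any algebra containing $\sigma_x$ and $\sigma_y$ must also contain $\sigma_z$, which is the obstruction the theorem describes.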

We can safely talk about "qudits" of information 
within the code, specified by the IPS shape. Each qu- 
dit corresponds to a logical subsystem - a d-dimensional 
Hilbert space within the full Hilbert space, which need 
not correspond to a physical subsystem but is nonethe- 
less an independent quantum degree of freedom. Multi- 
ple qudits in a direct sum represent a classical degree of 
freedom, for while the different terms in the direct sum 
correspond to perfectly distinguishable states, superpo- 
sitions across them are not preserved. We can use these 
rules to exhaustively catalog all the possible degrees of 
freedom (up to unitary rotations) within any given quan- 
tum system. 



C. Different kinds of IPS 

We identified the Weak Condition as the weakest rea- 
sonable condition for information to be preserved. It 
ensures that Bob can in principle restore the system's 
initial state - but, if Bob has limited resources, then he 
may be unable to do so in practice. Still, Bob's resources 
may be sufficient to correct a code that satisfies some 
stronger condition. Each operational constraint on Bob 
defines some condition on C that is necessary and suffi- 
cient for it to be "preserved" in this situation. 

One important example has already appeared: noiseless
information (Definition 8). Noiseless codes require
no correction at all, so noiselessness is a very strong
condition. In Section IV, we will consider several other
conditions. Each such condition defines a distinct class of
IPSs. So amongst one or more preserved IPSs a channel
may support, one may also be noiseless. A channel's
noiseless IPS is unique, because of its relationship to the
channel's fixed points (see also Section V for further
discussion of this point).

Most of the commonly studied techniques for information
preservation correspond either to a noiseless IPS,
or to a preserved/correctable IPS. Three of the "canonical"
structures that we mentioned in Section II (pointer
bases, DFSs, and NSs) correspond to noiseless IPSs.
Pointer bases have the shape {1,1,1...}, describing a
complete set of 1-dimensional k-sectors (both $A_k$ and $B_k$
are trivial for all $k$). A DFS has the shape {d}, describing
a single k-sector with a trivial $\mathcal{H}_{B_k}$. A NS has the
same shape {d}, but it corresponds to the $A_k$ factor of a
single k-sector with a nontrivial co-factor $B_k$.

The relationship between a NS defined in the traditional
way as discussed in Example 4 and a noiseless IPS
as defined in [19] and in this paper has some subtleties.
A noiseless IPS rests upon a family of noiseless codes,
or sets of states, whereas the traditional definition of a
NS makes no direct reference to sets of states. The
correspondence between the two frameworks arises because
Eq. (5) is satisfied if and only if there exist noiseless
codes. This does not imply that Eq. (5) has anything
directly to do with noiseless codes! In particular, a set of
states $\{\rho_{AB}\}$ satisfying Eq. (5) need not be a noiseless
code.

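Example 14. Consider a bipartite system AB with
Hilbert space $\mathcal{H}_{AB} = \mathcal{H} \otimes \mathcal{H}$, a channel $\mathcal{E}$ that depolarizes
system B but leaves A untouched, and the set of states
given by $\mathcal{C} = \{|\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|\}$ for all $|\psi\rangle \in \mathcal{H}$. Since

$$\mathcal{E}(|\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|) = |\psi\rangle\langle\psi| \otimes \frac{\mathbb{1}}{\dim(\mathcal{H})},$$

$\mathcal{C}$ satisfies Eq. (5). (In fact, every state $\rho_{AB}$ satisfies
Eq. (5).) Nonetheless, $\mathcal{C}$ is not noiseless. Eq. (5) merely
guarantees that a noiseless code will exist.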

Error-correcting codes are built upon preserved IPSs.
Most QECCs are subspace codes, so a code with a recovery
operation $\mathcal{R}$ is a DFS of $\mathcal{R} \circ \mathcal{E}$. While every subspace
code is associated to an NS of $\mathcal{R} \circ \mathcal{E}$ (as implied by Thm.
6 in [3]), an operator code (OQECC) is also an NS of
$\mathcal{R} \circ \mathcal{E}$, for the same $\mathcal{R}$. In each case, the code is built
upon the noiseless IPS of $\mathcal{R} \circ \mathcal{E}$, not of $\mathcal{E}$ itself. In fact,
$\mathcal{E}$ may have no noiseless IPS at all. However, since these
codes are correctable for $\mathcal{E}$, they are preserved by it, and
so they are associated with preserved IPSs of $\mathcal{E}$.

Example 15. Consider a system of 5 qubits, and a channel
$\mathcal{E}$ that picks one qubit at random and depolarizes it.
This is precisely the error model for which the 5-qubit
QECC was developed [39, 40], so $\mathcal{E}$ has a 1-qubit preserved
IPS. However, it has no noiseless codes at all,
because repeatedly applying $\mathcal{E}$ will eventually depolarize
all five qubits with high probability.

Example 12 demonstrates that a channel can have
more than one preserved IPS. Each is a noiseless IPS
for some $\mathcal{R} \circ \mathcal{E}$ (a consequence of Theorem 1), and may
be associated with many preserved codes, all of which
are corrected by the same $\mathcal{R}$. We would like to have a
procedure for listing, or at least counting, all the IPSs for
a given channel, but unfortunately we do not know how
to do this.

What we can say (from Theorem 1) is that $\mathcal{E}$'s IPSs
comprise all the noiseless IPSs of $\mathcal{R} \circ \mathcal{E}$ for all CPTP
maps $\mathcal{R}$. A simpler and stronger characterization follows
from the structure of the proof. The correction operation
for a code depends only on the code's support, so every
code with the same support will be corrected by the same
operation. This yields a simpler description: $\mathcal{E}$'s IPSs
comprise all the noiseless IPSs of $\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$ for all
subspaces $\mathcal{P} \subseteq \mathcal{H}$.

While this suggests a way of searching for IPSs (just 
try every subspace, one at a time), there are uncountably 
many subspaces to search (see [16]). It may be possible to 
reduce this problem to searching a countable, even finite 
set. Unfortunately, it is not possible to do so efficiently. 
Just finding the largest classical code for an arbitrary 
channel is NP-hard, so listing all its preserved IPS is at 
least this hard. More precisely, let the size of an IPS be 
measured by the total number of perfectly distinguish- 
able states in one of its preserved codes. Then we have 
the following: 

Lemma 8. The problem of finding the largest preserved
IPS for an arbitrary channel $\mathcal{E}: \mathcal{B}(\mathcal{H}_d) \to \mathcal{B}(\mathcal{H}_{d^2})$ that
maps a d-dimensional system to a $d^2$-dimensional system
is at least as hard as the NP-complete problem MAX-CLIQUE.



IV. OPERATIONAL CONSTRAINTS AND 
PRESERVED CODES 

Our focus thus far has been on a single notion of preser- 
vation. We assumed that Alice and Bob were unlimited 
in their actions (within the laws of physics), and ended 
up with a preservation condition that depended only on 
whether £ actually destroyed some of the information. In 
this section, we will relax this focus, and consider the ef- 
fect of restrictions on the sender and receiver. Bob may 
not want to correct the channel constantly, or he may 
not know how many times £ has been applied. Alice 
may have a faulty encoder - or perhaps she is not even 
cooperative. Operational constraints of this sort lead to 
alternative conditions for preservation. We shall discuss 
some of the most useful and interesting operational con- 
straints, and the corresponding types of IPS. 



A. Infinite-distance IPSs 

Suppose we want to store information in a physical
system for a time $T > 0$, during which $\mathcal{E}$ will be applied $n$
times. Further, we cannot perform any active operations
on the system during this period. Then the information
carried by a code $\mathcal{C}$ remains intact only if $\mathcal{C}$ is preserved
by the channel $\mathcal{E}^n$. If $T$ (or $n$) is unknown in advance,
$\mathcal{C}$ has to be preserved by all possible powers of $\mathcal{E}$. One
example of a channel for which this holds is a unitary
channel:

$$\mathcal{E}(\cdot) = U(\cdot)U^\dagger, \qquad (9)$$

for some unitary $U$. A unitary channel adds no noise
at all; it just rotates the code around, and the actual
rotation depends on how many times it is applied. As
long as we know how many times $U$ has been applied, we
can recover any initial state by applying $U^{-n}$.
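A short Python sketch of this recovery, under illustrative assumptions: $U$ is taken to be a real rotation of a single qubit (so that its Hermitian conjugate is just its transpose), and the angle, the number of applications, and the function names are our own choices.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def conjugate_by(u, rho, n):
    """Apply the unitary channel rho -> U rho U^dagger n times (U real, so U^dagger = U^T)."""
    for _ in range(n):
        rho = matmul(matmul(u, rho), transpose(u))
    return rho

angle = 0.4  # rotation angle in radians, an arbitrary illustrative value
u = [[math.cos(angle), -math.sin(angle)], [math.sin(angle), math.cos(angle)]]
rho = [[1.0, 0.0], [0.0, 0.0]]  # initial state |0><0|
n = 7                            # number of applications, assumed known to the receiver

evolved = conjugate_by(u, rho, n)
recovered = conjugate_by(transpose(u), evolved, n)  # undo the rotation n times

drift = max(abs(evolved[i][j] - rho[i][j]) for i in range(2) for j in range(2))
error = max(abs(recovered[i][j] - rho[i][j]) for i in range(2) for j in range(2))
```

The evolved state differs visibly from the initial one, while the recovered state agrees with it up to floating-point rounding, provided the same $n$ is used in both directions.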

This kind of behavior can be found even in channels 
that are not purely unitary: 

Example 16. Consider a channel on two qubits, labeled
A and B, which applies a unitary $U$ to qubit A and
depolarizes qubit B. The channel is not unitary, for it adds
entropy to any pure state, but nonetheless it acts
unitarily on qubit A. The code $\mathcal{C} = \{\rho_A \otimes (\mathbb{1}/2)_B \ \forall \rho_A\}$ is
preserved by any number of applications of $\mathcal{E}$.

We shall refer to a code that remains preserved no 
matter how many times £ is applied as unitarily noiseless 
under £ . Formally, we define a unitarily noiseless code 
as inH9l 

Definition 12. A code $\mathcal{C}$ is unitarily noiseless under
a CPTP map $\mathcal{E}$ if and only if it is preserved by $\mathcal{E}^n$ for any
$n \in \mathbb{N}$.

Notice that to retrieve the information stored in a unitarily
noiseless code, we need to know the value of n or,
equivalently, the length of time T, in order to construct
the appropriate Helstrom measurement. In the previous
example, if we lose track of n, then qubit A will get
dephased in the diagonal basis of U. Ensuring that uni- 
tarily noiseless codes are preserved indefinitely requires 
a good clock. 

Are there codes for which we do not even need a clock?
Certainly - for instance, a code containing fixed states of
$\mathcal{E}$. Such a code is fixed not only by $\mathcal{E}$, but also by $\mathcal{E}^n$ for
any n, and by any convex combination $\sum_n q_n \mathcal{E}^n$ (where
$\{q_n\}$ is a probability distribution). So someone ignorant
of n can describe the process by a mixture of different $\mathcal{E}^n$,
and information in a fixed code is still preserved! Moreover,
only the information-carrying part of the code needs
to be invariant under repeated applications, which is the
operational motivation for noiseless codes (Definition 8).
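The difference between needing a clock and not needing one can be illustrated with a small sketch. Assuming a single-qubit unitary channel with a hypothetical angle `theta` and a uniform distribution over the number of applications (both are illustrative choices), the mixture of powers dephases a state that merely rotates, while a fixed state is untouched.

```python
import numpy as np

theta = 1.0  # hypothetical rotation angle (radians)
U = np.diag([1.0, np.exp(1j * theta)])

plus = np.full((2, 2), 0.5, dtype=complex)      # rotates under U: needs a clock
fixed = np.diag([0.25, 0.75]).astype(complex)   # diagonal in U's eigenbasis: fixed

def mixed_channel(rho, q):
    """Return sum_n q[n] U^n rho U^{-n} for n = 0 .. len(q) - 1."""
    out = np.zeros_like(rho)
    Un = np.eye(2, dtype=complex)
    for qn in q:
        out += qn * (Un @ rho @ Un.conj().T)
        Un = U @ Un
    return out

N = 200
q = np.full(N, 1.0 / N)  # complete ignorance of n within 0 .. N-1

out_plus = mixed_channel(plus, q)    # coherence is washed out
out_fixed = mixed_channel(fixed, q)  # unchanged
```

The off-diagonal element of `out_plus` is strongly suppressed relative to its initial value of 0.5, whereas `out_fixed` equals `fixed`.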

Noiseless and unitarily noiseless information are pre- 
served indefinitely. No matter how many times £ is ap- 
plied, we can still distinguish code states. In classical 
information theory, the number of errors (i.e., bit flips) 
required to transform one code word into another is called 
the distance of the code. Under the more general definition
of distance introduced by Knill et al. [3] (based on
defining a single application of $\mathcal{E}$ as an "error"), noiseless
and unitarily noiseless codes are infinite-distance codes
with respect to the noise model defined by $\mathcal{E}$. Each
infinite-distance code is a manifestation of an underlying
noiseless or unitarily noiseless IPS. Infinite-distance
IPSs may be viewed as degrees of freedom into which 
£ introduces no entropy at all, transforming them re- 
versibly (if at all). We do not have to pump entropy out 
of infinite-distance IPS, and so no active error correction 
is required. For this reason, these have also been called 
passive error-correcting codes. 



B. Constraints on the recovery operation 

Suppose that we can do something to the system in 
between applications of £ . This is crucial whenever the 
channel preserves information, but maps it to a part of 
the Hilbert space that is unprotected against further ap- 
plications of £ . Now we must intervene, applying active 
correction to move our precious information back into 
protected sectors, and ensure its continued survival. If 
we can do absolutely anything, then we can correct any 
preserved code (thanks to Theorem [I]). In practice, how- 
ever, we may only be able to do certain operations. Any 
CPTP map can be decomposed into (i) a POVM measurement,
followed by (ii) a conditional unitary that depends
on the outcome of the POVM. This decomposition
suggests two natural restrictions on the recovery $\mathcal{R}$: It can consist only
of a measurement, or it can be completely unitary.



1. Measurement-stabilized codes

If unitary operations are costly or noisy, but measurements
can be performed relatively quickly, the only
"corrections" that we can perform effectively are pure
measurements. For our purposes (footnote 8), a measurement is a
POVM defined by a set of effects,

$$\mathcal{M} = \{E_m\}, \quad \text{where} \quad \sum_m E_m = \mathbb{1}.$$

The outcome of such a measurement is a particular value
of m, with probability $\Pr(m) = \mathrm{Tr}(E_m \rho)$, and a
post-measurement state

$$\rho \to E_m^{1/2}\, \rho\, E_m^{1/2},$$

where $E_m^{1/2}$ is the unique positive semidefinite square root
of $E_m$.
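The update rule above can be written out as a short sketch. The function names `psd_sqrt` and `luders_update`, and the two-outcome unsharp measurement with sharpness `eta`, are illustrative assumptions; the post-measurement state is returned unnormalized, matching the expression in the text, so its trace equals the outcome probability.

```python
import numpy as np

def psd_sqrt(E):
    """Unique positive semidefinite square root of a PSD matrix E."""
    w, V = np.linalg.eigh(E)
    w = np.clip(w, 0.0, None)  # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.conj().T

def luders_update(rho, effects, m):
    """Return (Pr(m), unnormalized post-measurement state) for outcome m."""
    S = psd_sqrt(effects[m])
    post = S @ rho @ S
    return np.trace(post).real, post

# Hypothetical unsharp measurement of sigma_z on one qubit
eta = 0.8
E0 = np.diag([(1 + eta) / 2, (1 - eta) / 2]).astype(complex)
E1 = np.eye(2) - E0

rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
probs = [luders_update(rho, [E0, E1], m)[0] for m in (0, 1)]
```

Choosing the positive semidefinite root (rather than an arbitrary root) is what makes this a pure measurement with no extra conditional unitary hidden in the update.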



8 This careful definition may seem pedantic. However, "measurements"
are sometimes defined very generally, with an update rule
involving any square root of the effect $E_m$. This trivializes our
distinction between measurements and arbitrary CP maps. The
convention we adopt here is known as Lüders' rule, and defines
the unique minimally disturbing (and maximally repeatable)
implementation of a given measurement.

Can we use measurements to correct noise? At first,
it seems implausible - after all, while a measurement
provides information, it actually does not do anything.
However, the existence of unitarily noiseless codes shows
that passive information gain, such as knowing how many
times $\mathcal{E}$ has been applied, can be useful. This motivates
a definition of measurement-stabilized codes, whose information
is preserved indefinitely provided that a measurement
is performed after every application of the channel:

Definition 13. A code $\mathcal{C}$ is measurement-stabilized
for a CPTP map $\mathcal{E}$ if there exists a measurement $\mathcal{M} = \{E_m\}$
such that, conditional on any outcome m, $\mathcal{C}$ is
unitarily noiseless for $\mathcal{M} \circ \mathcal{E}$.

Stabilizer codes for Pauli channels [23] are an example
of measurement-stabilized codes. Stabilizer codes divide
the system into two degrees of freedom, the code
and the syndrome. Measuring the syndrome "collapses" 
the error, revealing which Pauli unitary transformed the 
information-carrying subsystem. In the usual paradigm, 
we would undo this unitary - but this is not actually 
necessary, as long as we keep track of the current "Pauli 
frame" [41] by recording the results of each syndrome 
measurement as the system evolves. 

The key to reconciling the behavior of stabilizer codes
with Definition 13 is conditioning on the syndrome measurements.
Since each syndrome measurement collapses
the syndrome subsystem into a particular basis state, we
can see the overall system's dynamics, conditional on the
measurement record, as a rather strange time-dependent
unitary evolution: At each time step, the code subspace
gets transformed by some Pauli operator $P_i$, and the
syndrome state jumps from $|k\rangle \to |k+l\rangle$. Since the code
evolves unitarily at every step, it is unitarily noiseless,
and the information in it can be recovered at any time.

At first glance, this may seem trivial, for as we observed
above, any correction operation $\mathcal{R}$ can be written
as a measurement followed by a conditional unitary.
So, given a generic correctable code, couldn't we just do 
the measurement, skip the conditional unitary, and keep 
track of which unitary we did not do? This does not 
work in general, because £ may have moved the code to 
a different subspace which is not, itself, a code. Stabilizer 
codes can be measurement-stabilized because they actu- 
ally comprise a large set of preserved codes, and (condi- 
tional on the syndrome measurement) the channel merely 
permutes the codes while transforming them unitarily. It 
is an open question whether all measurement-stabilized 
codes are of this form (that is, a large set of isomorphic 
codes, indexed by a syndrome), or if the above definition 
permits other structures. 



2. Unitarily correctable codes 

In some systems, we have the opposite situation: Measurements
are slow and/or hard, while unitary evolution
is fast and relatively easy (liquid-state NMR quantum
computation is an extreme example; most solid-state
architectures also fall into this category). Now we can only
apply unitary gates after each application of $\mathcal{E}$. The
authors of Ref. [29] considered this situation, and demanded
that there exist a unitary matrix U on
$\mathcal{H} = (\mathcal{H}_A \otimes \mathcal{H}_B) \oplus \mathcal{H}_C$ such that
$\mathrm{tr}_B\{U \mathcal{E}(\rho_{AB}) U^\dagger\} = \mathrm{tr}_B\, \rho_{AB}$
for all $\rho_{AB} \in \mathcal{B}(\mathcal{H}_A \otimes \mathcal{H}_B)$. The A subsystem is a
unitarily correctable (footnote 9) subsystem (see also [18]).

Definition 14. A code $\mathcal{C}$ is unitarily correctable for a
channel $\mathcal{E}$ if there exists a unitary correction map
$\mathcal{U}(\cdot) = U \cdot U^\dagger$, for some unitary operator U, so that
$\mathcal{C}$ is noiseless for $\mathcal{U} \circ \mathcal{E}$.

Unitarily correctable codes are interesting in part because
$\mathcal{E}$ does not inject entropy into the code states (footnote 10).
If it did, the error could not be corrected by a unitary 
operation. Kribs and Spekkens considered unitarily cor- 
rectable codes in some detail in Ref. [18] and noted that 
while any preserved code is "unitarily recoverable" - i.e., 
there is a unitary that puts the information back into 
the subsystem where it originated - this need not suf- 
fice to correct the errors, and cooling may be required to 
protect the information against subsequent iterations of 
the noise. Sufficient conditions for unitary correctabil- 
ity have likewise been directly derived from the structure 
of 1-isometric encodings [TT] (see, in particular, Prop. 1 
therein). 

Example 17. Consider two qubits labeled A and B, and
let $\mathcal{E}$ act as follows: B is measured in the $\{|0\rangle, |1\rangle\}$ basis;
if the result was "1", then A is depolarized. Finally, B
is depolarized. The code
$\mathcal{C} = \{\rho_A \otimes |0\rangle\langle 0|_B \;\; \forall \rho_A\}$ is preserved.
It is unitarily recoverable - in fact, no recovery is
necessary because the information remains in the A subsystem.
It is not unitarily correctable, however, because
unless B is cooled to the $|0\rangle$ state, $\mathcal{E}$'s next iteration may
damage the information.
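A short numerical sketch of Example 17 makes the distinction concrete: one application of the channel leaves the distinguishability of two code states intact, while a second application, made without resetting B, reduces it. The helper names and the particular pair of code states are illustrative assumptions.

```python
import numpy as np

def trace_norm(X):
    """Trace norm ||X||_1, computed as the sum of singular values."""
    return np.linalg.svd(X, compute_uv=False).sum()

I2 = np.eye(2, dtype=complex)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def ptrace_B(rho):
    """Partial trace over qubit B for a two-qubit state ordered A (x) B."""
    return np.einsum('abcb->ac', rho.reshape(2, 2, 2, 2))

def channel(rho):
    # Non-selective measurement of B in the computational basis.
    keep = np.kron(I2, P0) @ rho @ np.kron(I2, P0)  # outcome 0: A untouched
    hit = np.kron(I2, P1) @ rho @ np.kron(I2, P1)   # outcome 1: A depolarized
    rho_A = ptrace_B(keep) + np.trace(hit).real * I2 / 2
    # Finally, B is depolarized.
    return np.kron(rho_A, I2 / 2)

# Two code states rho_A (x) |0><0|_B
rho = np.kron(P0, P0)
sigma = np.kron(P1, P0)

d0 = trace_norm(0.5 * rho - 0.5 * sigma)
d1 = trace_norm(0.5 * channel(rho) - 0.5 * channel(sigma))
d2 = trace_norm(0.5 * channel(channel(rho)) - 0.5 * channel(channel(sigma)))
```

Here `d1` equals `d0`, while `d2` is half as large, because after the first application B is maximally mixed and triggers depolarization of A half of the time.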

Kribs and Spekkens also pointed out that, under cer- 
tain circumstances, unitarily correctable codes can be 
found efficiently. This observation is closely related to 
our next topic. 



C. Unconditionally preserved information 

9 The authors of [29] called this "unitarily noiseless", but we
believe the term "unitarily correctable" is more appropriate.
10 Actually, it is slightly more technical than this: Given any
unitarily correctable code, there is another code associated with the
same unitarily correctable IPS, into which $\mathcal{E}$ does not inject any
entropy. This is directly related to the fact that a code can be
noiseless without being fixed - in both cases, repeated application
of $\mathcal{E}$, or $\mathcal{U} \circ \mathcal{E}$, causes the code to converge toward a fixed
code, whose entropy does not increase thereafter.

If a code $\mathcal{C}$ is preserved, then Bob can distinguish between
states in $\mathcal{C}$ (and their convex combinations) just
as well as Alice. So if we want to know "Was the system
prepared in $|\psi\rangle \in \mathcal{C}$?", Bob can answer just as well as
Alice could have, by discriminating $|\psi\rangle\langle\psi|$ from a convex
combination of all states orthogonal to $|\psi\rangle$. What he
cannot do is determine whether the initial state was in $\mathcal{C}$.
Information is preserved conditional on the system being
prepared in $\mathcal{C}$, as illustrated by the following example.

Example 18. Let $\mathcal{E}$ be the following [effectively classical]
channel from a d-dimensional system to itself. On
the subspace $\mathcal{H}_{d-1}$ spanned by $\{|0\rangle, \ldots, |d-2\rangle\}$, $\mathcal{E}$ acts as
the identity channel. However, $|d-1\rangle$ is decohered and
mapped to the maximally mixed state $\tfrac{1}{d}\mathbb{1}$.

The code comprising all states on $\mathcal{H}_{d-1}$ is preserved,
so Bob can distinguish between $|0\rangle$ and any convex combination
of $|1\rangle, \ldots, |d-2\rangle$. If the input state was supported
on $\mathcal{H}_{d-1}$, Bob can determine whether $|0\rangle$ was prepared.
Without this promise, however, any measurement result
on the output is consistent with the input state $|d-1\rangle$.

Sometimes, a channel preserves some properties of the 
input state irrespective of what it is. For instance, if £ 
is the identity channel, then Bob can make any measure- 
ment that Alice can. His conclusions from those mea- 
surements do not depend on any prior information about 
the input. The following example is less trivial. 

Example 19. Consider the classical channel whose action
is pictorially shown below:

[Figure: channel diagram mapping INPUT states to OUTPUT states]

which corresponds to a stochastic map of the form

[Equation: stochastic matrix of the channel]

Bob can measure {1'} vs. {2', 3', 4'}, and from the result
infer exactly what Alice would have gotten had she measured
{1, 2} vs. {3, 4}. So this property of the input state
is unconditionally preserved: No matter what the input
state was, Bob can determine whether it was in {1, 2} or
not. Note that unconditional preservation need not be related
to noiselessness - applying this channel twice ruins
the information.

This illustrates unconditionally preserved information. 
The most natural way to define unconditional preserva- 
tion is not in terms of states or codes, however, but rather 
in terms of measurements. 



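Definition 15. Let $\mathcal{E} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H}')$ be a channel, and
$\mathcal{M} = \{P_1 \ldots P_n\}$ a projective measurement on Hilbert
space $\mathcal{H}$ (so $\sum_k P_k = \mathbb{1}$). Then $\mathcal{M}$ is unconditionally
preserved by $\mathcal{E}$ if and only if there exists another measurement
$\mathcal{M}' = \{Q_1 \ldots Q_n\}$ on $\mathcal{H}'$ such that $\mathcal{M}'$ simulates
$\mathcal{M}$: that is, $\mathrm{Tr}[P_k \rho] = \mathrm{Tr}[Q_k \mathcal{E}(\rho)]$ for all density
matrices $\rho$ on $\mathcal{H}$.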
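For a classical channel the simulation condition reduces to a statement about indicator vectors, which the following sketch checks. The column-stochastic matrix `T` is a hypothetical four-state channel chosen for illustration (it is an assumption, not the matrix of any example in the text); `p_in` and `q_out` play the roles of one of Alice's projectors and Bob's simulating effect.

```python
import numpy as np
from itertools import product

# Hypothetical column-stochastic map: T[out, in] = Pr(out | in)
T = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.5],
])

p_in = np.array([1.0, 1.0, 0.0, 0.0])   # Alice's event: input in {1, 2}
q_out = np.array([1.0, 0.0, 0.0, 0.0])  # Bob's event: output is 1'

# Heisenberg picture: pull Bob's indicator back through the channel.
# The condition Tr[P rho] = Tr[Q E(rho)] for all rho becomes T^T q = p.
once_ok = np.allclose(T.T @ q_out, p_in)

# After two uses of the channel, search all projective events on the output.
T2 = T @ T
twice_ok = any(
    np.allclose(T2.T @ np.array(q, dtype=float), p_in)
    for q in product([0, 1], repeat=4)
)
```

For this particular `T`, `once_ok` is true and `twice_ok` is false: the event is unconditionally preserved by one use of the channel but not by two, so unconditional preservation does not imply noiselessness.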

This condition on measurements is based in the Heisenberg
picture of quantum mechanics, in which states stay
fixed, but measurements evolve according to $\mathcal{E}^\dagger$. In order
for $\mathcal{M}$ to be unconditionally preserved, there must
be some measurement $\mathcal{M}'$ that evolves into $\mathcal{M}$. We can
also, if desired, define an equivalent condition on states:

Definition 16. A code $\mathcal{C}$ is unconditionally preserved
by a channel $\mathcal{E}$ if and only if the Helstrom measurement
for every weighted pair of states $p\rho$, $q\sigma$ in the
convex closure of $\mathcal{C}$ is unconditionally preserved.

The second definition is strictly more general: Every
unconditionally preserved measurement $\mathcal{M} = \{P_1 \ldots P_n\}$
can be identified uniquely with a code

$$\mathcal{C} = \left\{ \frac{P_1}{\mathrm{Tr}(P_1)}, \ldots, \frac{P_n}{\mathrm{Tr}(P_n)} \right\},$$

which is unconditionally preserved if and only if $\mathcal{M}$ is.
Every classical code whose support is all of $\mathcal{H}$ defines a
single unconditionally preserved measurement. Quantum
(or hybrid) codes whose support is all of $\mathcal{H}$ define entire
algebras of unconditionally preserved measurements.
Codes restricted to a subspace do not generally correspond
to unconditionally preserved measurements.

The code associated with a given unconditionally
preserved measurement spans the entire Hilbert space.
Therefore, following the proof of Theorem [T], it can be
corrected using a transpose map $\hat{\mathcal{E}}_{\mathcal{P}}$ - where $\mathcal{P}$ is the
entire Hilbert space! Since this statement holds for every
unconditionally preserved measurement, we can correct
every unconditionally preserved code using a single
unique recovery, which we denote $\hat{\mathcal{E}}$:

$$\hat{\mathcal{E}}(\cdot) = \mathcal{E}^\dagger\!\left( \mathcal{E}(\mathbb{1})^{-1/2} \,\cdot\, \mathcal{E}(\mathbb{1})^{-1/2} \right). \tag{10}$$

It follows that every unconditionally preserved measurement
consists of projectors that are fixed points of
$\hat{\mathcal{E}} \circ \mathcal{E}$. There exists a unique unconditionally preserved
IPS, which contains all the unconditionally preserved
codes. Moreover, we can find its structure quite easily
by constructing and diagonalizing $\hat{\mathcal{E}} \circ \mathcal{E}$. Other codes are
hard to find, precisely because we need to know their
support $\mathcal{P}$.

Kribs and Spekkens observed that if $\mathcal{E}$ is unital (that
is, $\mathcal{E}(\mathbb{1}) = \mathbb{1}$), then its unitarily correctable codes are
fixed points of $\mathcal{E}^\dagger \circ \mathcal{E}$. This is an interesting special case of
unconditional preservation. If $\mathcal{C}$ is unitarily correctable,
then the channel does not add any entropy to it - thus,
every pure state in the code remains pure. But if $\mathcal{E}$ is
unital, it cannot map two orthogonal subspaces to overlapping
subspaces of the same size, because this would
cause a pile-up of probability on the overlapping portion.
So every unitarily correctable code must be unconditionally
preserved, because no other subspace can be piled
on top of it in the output space. Finally, for a unital
channel, $\hat{\mathcal{E}} = \mathcal{E}^\dagger$, so $\mathcal{E}^\dagger$ corrects every unconditionally
preserved code.

V. APPLICATIONS 

In this section, we present three applications of the 
IPS framework that we have derived. First, we state a 
very simple algorithm that efficiently finds all noiseless 
and unitarily noiseless codes for a given map £ . We then 
present a similar algorithm to find all the uncondition- 
ally preserved codes. Finally, we show how to address 
so-called "initialization-free" DFSs and NSs within our 
framework. 



A. Finding infinite-distance codes 

Our discussion suggests a natural strategy for finding
all the preserved codes of a channel $\mathcal{E}$: First, find all
its preserved IPSs; then build codes from the IPSs.
Unfortunately, there is a potential IPS for each and every
subspace $\mathcal{P} \subseteq \mathcal{H}$. So, searching for IPSs seems to require
an exhaustive search over all subspaces of $\mathcal{H}$ (see [16]).
We can find some preserved codes by picking particular
subspaces, but we may not find the largest IPS (or any
of them). Since the problem is NP-hard, an efficient
algorithm seems unlikely (though it should be noted that
we have only proven that finding the best classical code
is NP-hard - other special cases, for instance the largest
quantum code, might conceivably be easier).

Let us focus instead on noiseless codes. The noiseless
IPS of $\mathcal{E}$ is unique, because all the maximum noiseless
codes are isometric to $\mathcal{E}$'s fixed points. So, to find
the unique noiseless IPS, we need only determine the
structure of $\mathcal{E}$'s fixed points. Theorem 5 defines this
structure, and suggests an efficient algorithm to find it:

Algorithm for finding noiseless IPS:

1. Write $\mathcal{E}$ as a $d^2 \times d^2$ matrix, where d is the dimension
of the Hilbert space.

2. Diagonalize the matrix, and extract its eigenvalue-1
right and left eigenspaces (corresponding to Fix($\mathcal{E}$)
and Fix($\mathcal{E}^\dagger$), respectively).

3. Compute $\mathcal{P}_0$, the support of Fix($\mathcal{E}$), and project
Fix($\mathcal{E}^\dagger$) onto $\mathcal{P}_0$ to obtain a basis for $\mathcal{A}$.

4. Find the shape of $\mathcal{A}$.
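Steps 1-3 can be sketched numerically as follows, assuming the channel is supplied as a list of Kraus operators. The function names are illustrative, and two implementation choices are assumptions of this sketch rather than part of the stated algorithm: operators are vectorized row-major (so the channel matrix is a sum of `kron(K, conj(K))`), and the eigenvalue-1 eigenspaces are obtained as null spaces via a singular value decomposition instead of a full diagonalization. Step 4 (the canonical decomposition of the algebra) is not attempted here.

```python
import numpy as np

def superoperator(kraus):
    """d^2 x d^2 matrix of rho -> sum_i K_i rho K_i^dagger (row-major vec)."""
    return sum(np.kron(K, K.conj()) for K in kraus)

def unit_eigenspace(S, tol=1e-9):
    """Orthonormal basis (columns) of {v : S v = v}, as the null space of S - 1."""
    _, sv, Vh = np.linalg.svd(S - np.eye(S.shape[0]))
    return Vh.conj().T[:, sv < tol]

def noiseless_algebra_span(kraus, tol=1e-9):
    """Return (isometry onto P_0, spanning set for the algebra on P_0)."""
    d = kraus[0].shape[0]
    S = superoperator(kraus)
    # Step 2: right and left eigenvalue-1 eigenspaces.
    fix = [v.reshape(d, d) for v in unit_eigenspace(S, tol).T]
    fix_dual = [v.reshape(d, d) for v in unit_eigenspace(S.conj().T, tol).T]
    # Step 3a: support P_0 of Fix(E) = range of sum_j X_j X_j^dagger.
    G = sum(X @ X.conj().T for X in fix)
    w, V = np.linalg.eigh(G)
    V0 = V[:, w > tol]
    # Step 3b: compress Fix(E^dagger) onto P_0.
    compressed = [V0.conj().T @ Y @ V0 for Y in fix_dual]
    return V0, compressed

# Usage: complete dephasing of a qubit in the computational basis
P0 = np.diag([1.0, 0.0]).astype(complex)
P1 = np.diag([0.0, 1.0]).astype(complex)
V0, basis = noiseless_algebra_span([P0, P1])
dim_A = np.linalg.matrix_rank(np.array([B.ravel() for B in basis]))
```

For the dephasing channel the support is the whole space and the spanning set has rank 2, consistent with a classical bit being the only noiseless information.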

In the last step, we need to find the canonical decomposition,
Eq. (7), of a finite-dimensional matrix algebra
specified as a linear span. This can be done efficiently
using, for example, the algorithm presented in Ref. [42].
This canonical decomposition step is also present in
existing algorithms for finding NSs [16, 17]. Our algorithm
improves on previous algorithms by providing a
straightforward method of finding $\mathcal{A}$ as a linear span. Its hardest
step is diagonalizing a $d^2 \times d^2$ matrix, which runs in time
$O(d^6)$. As such, it is more efficient than algorithms (such
as [15, 16]) that require exhaustive search over states or
subspaces in $\mathcal{H}$, for these sets grow exponentially in
volume with d.

We can generalize this algorithm to find an arbitrary
channel's unitarily noiseless IPS. Whereas the noiseless
IPS consists of $\mathcal{E}$'s fixed points - operators X such that
$\mathcal{E}(X) = X$ - the unitarily noiseless IPS consists of rotating
points - operators X such that $\mathcal{E}(X) = e^{i\phi_X} X$.

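Definition 17. Let $\mathcal{E} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ be a CPTP map.
An operator $X \in \mathcal{B}(\mathcal{H})$ is a unitary eigenoperator of
$\mathcal{E}$ if and only if $\mathcal{E}(X) = e^{i\phi} X$ for some $\phi \in \mathbb{R}$. The
rotating points of $\mathcal{E}$ comprise all operators in the span
of its unitary eigenoperators.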

Note that a rotating point need not be an eigenoperator -
for instance, a linear combination of two unitary
eigenoperators with different phases is a rotating
point, but not itself an eigenoperator. As an example,
consider the unitary qubit channel
$\mathcal{E}(\rho) = e^{-i\phi\sigma_z} \rho\, e^{i\phi\sigma_z}$.
The Pauli operators $\sigma_x$ and $\sigma_y$ are not eigenoperators,
but they are rotating points.
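The qubit example can be checked directly. In the sketch below, the angle `phi` is an arbitrary illustrative value; the raising and lowering combinations of the two Pauli operators are the unitary eigenoperators, with phases $e^{\mp 2i\phi}$, while $\sigma_x$ itself is rotated into a mixture of $\sigma_x$ and $\sigma_y$.

```python
import numpy as np

phi = 0.4  # hypothetical rotation angle
U = np.diag([np.exp(-1j * phi), np.exp(1j * phi)])  # e^{-i phi sigma_z}

def channel(X):
    """Unitary qubit channel X -> U X U^dagger."""
    return U @ X @ U.conj().T

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
s_plus = (sx + 1j * sy) / 2   # |0><1|
s_minus = (sx - 1j * sy) / 2  # |1><0|

# Unitary eigenoperators with opposite phases
is_eig_plus = np.allclose(channel(s_plus), np.exp(-2j * phi) * s_plus)
is_eig_minus = np.allclose(channel(s_minus), np.exp(2j * phi) * s_minus)

# sigma_x lies in their span, but is not itself an eigenoperator
rotated_sx = channel(sx)
sy_component = np.trace(sy @ rotated_sx).real / 2
```

Both flags evaluate to true, and `sy_component` equals $\sin 2\phi$, which is nonzero for this choice of angle.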

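Lemma 9. If $\mathcal{C}$ is a maximum unitarily noiseless code
for a CP map $\mathcal{E}$, then $\mathcal{C}$ is isometric to the set of all
(positive trace-1) states in the span of the rotating points
of $\mathcal{E}$. In other words, there exists a map $\mathcal{E}_{\rm inf}$ such that
$\| p\,\mathcal{E}_{\rm inf}(\rho) - (1-p)\,\mathcal{E}_{\rm inf}(\sigma) \|_1 = \| p\rho - (1-p)\sigma \|_1$ for any
$\rho, \sigma \in \mathcal{C}$, $p \in [0,1]$, and $\mathcal{E}_{\rm inf}(\rho)$ and $\mathcal{E}_{\rm inf}(\sigma)$ are in the
span of the rotating points of $\mathcal{E}$.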

We adapt the above algorithm by shifting its focus
from fixed points to rotating points. It is useful to note
that the support of the rotating points is the same as the
support $\mathcal{P}_0$ of the fixed points. Therefore, we just need
to replace step 2 above by the following:

2'. Diagonalize the matrix, and extract the right and
left eigenoperators with unit-modulus eigenvalues.
Let Fix($\mathcal{E}$) (Fix($\mathcal{E}^\dagger$)) be the linear span of the
unit-modulus right (left) eigenoperators.
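A sketch of step 2' for right eigenoperators is given below, using a two-qubit channel of the kind described in Example 16 (a unitary on A, full depolarization of B). The Kraus decomposition, the angle `phi`, the tolerance, and the function names are illustrative assumptions; operators are vectorized row-major.

```python
import numpy as np

def superoperator(kraus):
    """d^2 x d^2 matrix of rho -> sum_i K_i rho K_i^dagger (row-major vec)."""
    return sum(np.kron(K, K.conj()) for K in kraus)

def peripheral_eigenoperators(kraus, tol=1e-9):
    """Eigenvalues of unit modulus and the corresponding right eigenoperators."""
    d = kraus[0].shape[0]
    vals, vecs = np.linalg.eig(superoperator(kraus))
    keep = np.abs(np.abs(vals) - 1.0) < tol
    return vals[keep], [v.reshape(d, d) for v in vecs[:, keep].T]

phi = 0.4  # hypothetical rotation angle for qubit A
U = np.diag([np.exp(-1j * phi), np.exp(1j * phi)])
paulis = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# U on A, complete depolarization on B
kraus = [np.kron(U, s) / 2 for s in paulis]

vals, ops = peripheral_eigenoperators(kraus)
n_rotating = len(vals)                              # dimension of rotating-point span
n_fixed = int(np.sum(np.abs(vals - 1.0) < 1e-9))    # dimension of fixed-point span
```

For this channel four eigenvalues have unit modulus, of which two equal 1: the rotating points carry a full qubit, whereas the fixed points alone carry only a classical bit.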

This again runs in time $O(d^6)$ as before. It is (to our
knowledge) the first efficient algorithm to find unitarily 
noiseless codes for arbitrary channels. We note in passing 
that both algorithms - for finding noiseless and unitarily 
noiseless IPS - rely on the codes having infinite distance, 
so they are unlikely to be adaptable to finding other kinds 
of IPS.

B. Finding the unconditionally preserved IPS

We know that preserved codes are in general hard to
find, but in the previous section we saw how to take
advantage of infinite-distance codes' structure to find the
unique noiseless and unitarily noiseless IPSs. Unconditionally
preserved IPSs are another special case. A channel
has a unique unconditionally preserved IPS, and we
can find it efficiently. The algorithm is extremely simple:
Construct

$$\hat{\mathcal{E}}(\cdot) = \mathcal{E}^\dagger\!\left( \mathcal{E}(\mathbb{1})^{-1/2} \,\cdot\, \mathcal{E}(\mathbb{1})^{-1/2} \right), \tag{11}$$

diagonalize $\hat{\mathcal{E}} \circ \mathcal{E}$, and extract its fixed points (the
eigenspace with eigenvalue +1). These will form an algebra,
which defines the IPS we are looking for. However,
one might reasonably inquire why the unconditionally
preserved IPS is interesting and useful.
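The construction can be sketched as follows for a channel given by Kraus operators. This sketch assumes that the image of the identity under the channel is strictly positive, so that its inverse square root exists (a rank-deficient image would need a pseudo-inverse on its support). With row-major vectorization, the matrix of the composite map is Hermitian and positive semidefinite, so a Hermitian eigensolver suffices. The classical channel `T` used for illustration is hypothetical, as are all function names.

```python
import numpy as np

def superoperator(kraus):
    """Matrix of rho -> sum_i K_i rho K_i^dagger (row-major vec)."""
    return sum(np.kron(K, K.conj()) for K in kraus)

def unconditionally_preserved_fixed_points(kraus, tol=1e-9):
    """Basis of the eigenvalue-1 eigenspace of (recovery o channel)."""
    d_in = kraus[0].shape[1]
    S = superoperator(kraus)
    E_of_1 = sum(K @ K.conj().T for K in kraus)        # channel applied to identity
    w, V = np.linalg.eigh(E_of_1)
    M = (V / np.sqrt(w)) @ V.conj().T                  # inverse square root; assumes w > 0
    composite = S.conj().T @ np.kron(M, M.conj()) @ S  # Hermitian, positive semidefinite
    vals, vecs = np.linalg.eigh(composite)
    keep = np.abs(vals - 1.0) < tol
    return [v.reshape(d_in, d_in) for v in vecs[:, keep].T]

# Hypothetical classical channel, T[out, in] = Pr(out | in)
T = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.5],
])
e = np.eye(4, dtype=complex)
kraus = [np.sqrt(T[j, i]) * np.outer(e[j], e[i])
         for j in range(4) for i in range(4) if T[j, i] > 0]

fixed = unconditionally_preserved_fixed_points(kraus)
```

For this `T` the fixed-point space is two-dimensional and is spanned by the projectors onto inputs {1, 2} and {3, 4}, that is, a single unconditionally preserved classical bit.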

If we ask "What information is preserved by a given 
channel £?", then one possible answer consists of an ex- 
haustive list of all the channel's preserved IPSs. This is 
somewhat unsatisfactory for three reasons. First, we do 
not know how to find such a list (though we know that 
it is generally hard). Second, it might be very long,
even for channels on small systems. Third, the preserved 
codes corresponding to these IPSs represent information 
that could be preserved by the channel, depending on 
what the sender chooses to do, and conditional upon prior 
agreement between sender and receiver. 

Unconditional preservation provides an alternative an- 
swer. Every channel has a unique unconditionally pre- 
served IPS, comprising all the information that is def- 
initely preserved by £ . In the important case where 
the "sender" is a natural process, this IPS represents ev- 
erything that the observer can determine with certainty. 
Any further conclusions are valid only conditional upon 
certain prior assertions about the "distant" system (e.g., 
that its state lay in some subspace V). This interpreta- 
tion alone is sufficient reason to consider the uncondition- 
ally preserved IPS - independent of the happy accident 
that it is unique and easily calculable. 



C. Initialization-free DFS and NS



As discussed in Section III C, DFSs and NSs are manifestations
of $\mathcal{E}$'s noiseless IPS. We can demand further
operational requirements on a DFS or NS. One particular
criterion is robustness against initialization errors -
that is, we demand not only that information encoded
in the DFS/NS be preserved indefinitely, but also that if
Alice failed to prepare a state within the DFS/NS,
this can be detected by Bob. Such "initialization-free"
(IF) DFS and NS were first studied in Ref. [25] and have
been further characterized in Ref. [26] in the context of
Markovian dynamics. Since a DFS is just a NS with a
trivial noise-full subsystem, we shall focus on IF-NS (footnote 11).
If we decompose the system's Hilbert space as

$$\mathcal{H} = (A \otimes B) \oplus C,$$

and A supports a NS, then we can write an arbitrary
density operator in the following block form:



$$\rho = \begin{pmatrix} \rho_{AB} & \rho' \\ \rho'^{\dagger} & \rho_C \end{pmatrix}. \tag{12}$$



The NS is said to be perfectly initialized whenever the
off-diagonal block $\rho'$ and $\rho_C$ are zero. If, in practice, it is
not possible to guarantee preparation within $A \otimes B$, then we
need a special kind of NS that is insensitive to such
initialization errors. The NS is initialization-free if the
(possibly subnormalized) state $\rho_{AB}$ on $A \otimes B$ satisfies the
NS condition of Eq. (5), even when $\rho_C$ is not zero. In other
words, an IF-NS is one that is immune to interference coming
in from orthogonal subspaces of $\mathcal{H}$ (i.e., states that would
not have been prepared if the system had been perfectly
initialized).

Our framework, as it turns out, provides a simple and 
elegant condition for initialization-free NSs: An NS is 
IF if and only if it is noiseless and unconditionally pre- 
served. So, we can find a channel's IF noiseless structures 
by intersecting its noiseless IPS and its unconditionally 
preserved IPS. In the remainder of this section, we will 
demonstrate this equivalence. 

Given a Hilbert space $\mathcal{H}$ and a channel $\mathcal{E}$, the channel's
noiseless IPS defines a subspace decomposition
$\mathcal{H} = \mathcal{P}_0 \oplus \bar{\mathcal{P}}_0$. Subspace $\mathcal{P}_0$ is the support of the noiseless
IPS. The noiseless IPS also defines a canonical decomposition
of $\mathcal{P}_0$ into k-sectors $(A_k \otimes B_k)$, so we write the
Kraus operators of $\mathcal{E}$ accordingly, as:



$$K_i = \begin{pmatrix} \bigoplus_k A_{i,k} & D_i \\ 0 & C_i \end{pmatrix}. \tag{13}$$



Each k-sector is an invariant subspace. So each NS ($A_k$)
is automatically resilient to initialization errors that
prepare states in the wrong k-sector (but still within $\mathcal{P}_0$).

However, if faulty initialization puts support on $\bar{\mathcal{P}}_0$,
then this error may spill into the noiseless sector.
Specifically, the $D_i$ blocks in Eq. (13) map $\bar{\mathcal{P}}_0$ into $\mathcal{P}_0$, which can
interfere with information stored in noiseless codes. Since
every NS is immune to interference from other k-sectors
within $\mathcal{P}_0$, let us consider interference from $\bar{\mathcal{P}}_0$.

11 Ref. [25] actually discusses a more general case, allowing unitary
evolution of the NS. This is what we call a unitarily noiseless
code. To be consistent with the usual definition of NS, we use
the "strict" NS condition given in Eq. (5), but everything in this
section can easily be generalized by using a channel's unitarily
noiseless IPS instead of its noiseless IPS.

Consider, for the sake of simplicity, a noiseless IPS
containing a single k-sector, so $\mathcal{P}_0 = A \otimes B$ (as in
Eq. (12)). Let P be the projector onto $\mathcal{P}_0$. The Kraus
operators are



$$K_i = \begin{pmatrix} A_i & D_i \\ 0 & C_i \end{pmatrix}, \tag{14}$$



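and if the initial state is $\rho$ as given in Eq. (12), then the
final state on $\mathcal{P}_0 = A \otimes B$ is

$$P\,\mathcal{E}(\rho)\,P = \sum_i A_i \rho_{AB} A_i^\dagger + \sum_i D_i \rho_C D_i^\dagger + \sum_i \left( A_i \rho' D_i^\dagger + D_i \rho'^{\dagger} A_i^\dagger \right). \tag{15}$$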

For perfect initialization, only the first term is present.
The remaining terms represent interference from faulty
initialization on $\bar{\mathcal{P}}_0$. The NS is IF if and only if they
vanish, which requires

$$\sum_i D_i \rho_C D_i^\dagger = -\sum_i \left( A_i \rho' D_i^\dagger + D_i \rho'^{\dagger} A_i^\dagger \right). \tag{16}$$

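Since $\rho_C$ is positive semidefinite, the left-hand side of
Eq. (16) is also positive semidefinite. But the right-hand
side of Eq. (16) must be traceless, because in order for $\mathcal{E}$
to be trace-preserving, $\sum_i A_i^\dagger D_i = 0$, and so

$$\mathrm{Tr}\left[ \sum_i \left( A_i \rho' D_i^\dagger + D_i \rho'^{\dagger} A_i^\dagger \right) \right] = 2\,\mathrm{Re}\,\mathrm{Tr}\left\{ \sum_i A_i^\dagger D_i \rho'^{\dagger} \right\} = 0.$$

So the left-hand side is positive semidefinite and traceless,
which means it vanishes - and so Eq. (16) holds if
and only if $D_i \rho_C D_i^\dagger = 0$ for all $\rho_C$ - which implies
$D_i = 0$ for all i.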
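For completeness, the trace-preservation step used above can be written out explicitly. This is a worked sketch assuming the block upper-triangular form of the Kraus operators in Eq. (14), with blocks $A_i$, $D_i$, $C_i$; it only spells out the intermediate algebra.

```latex
% Trace preservation, \sum_i K_i^\dagger K_i = \mathbb{1}, in block form:
\sum_i K_i^\dagger K_i
  = \sum_i
    \begin{pmatrix}
      A_i^\dagger A_i & A_i^\dagger D_i \\
      D_i^\dagger A_i & D_i^\dagger D_i + C_i^\dagger C_i
    \end{pmatrix}
  = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & \mathbb{1} \end{pmatrix}
  \quad\Longrightarrow\quad
  \sum_i A_i^\dagger D_i = 0 .

% Cyclicity of the trace then gives, for the cross terms,
\mathrm{Tr}\!\left[ D_i \rho'^{\dagger} A_i^\dagger \right]
  = \mathrm{Tr}\!\left[ A_i^\dagger D_i \rho'^{\dagger} \right],
\qquad
\mathrm{Tr}\!\left[ A_i \rho' D_i^\dagger \right]
  = \overline{\mathrm{Tr}\!\left[ A_i^\dagger D_i \rho'^{\dagger} \right]} ,

% so their sum over i is 2 Re Tr{ (\sum_i A_i^\dagger D_i) \rho'^\dagger } = 0.
```

The off-diagonal block of the trace-preservation condition is therefore the only ingredient needed to show that the right-hand side of Eq. (16) is traceless.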

This means that in order for an NS whose support
is $\mathcal{P}_k = A_k \otimes B_k$ to be IF, the channel must not map
anything from $\bar{\mathcal{P}}_0$ into $\mathcal{P}_k$. That is, $\mathcal{P}_k$ is orthogonal to
$\mathcal{E}(\rho_C)$ for every $\rho_C \geq 0$ on $\bar{\mathcal{P}}_0$ (and, by Lemma 11 in
Appendix B 1, it is sufficient to consider just one full-rank
$\rho_C$ on $\bar{\mathcal{P}}_0$). But this is precisely the condition for
the corresponding code to be unconditionally preserved:
Bob must be able to determine whether the system was
correctly initialized, which means that the channel must
not map any part of $\bar{\mathcal{P}}_0$ back into $\mathcal{P}_k$.



VI. CONCLUSIONS AND OUTLOOK 

We have presented a framework characterizing the information
preserved by a quantum process, described by
an arbitrary CPTP map $\mathcal{E}$ acting on a finite-dimensional
quantum system. Information is carried by codes; codes
are preserved if their associated information can be extracted
after passing through the channel; preservation
implies correctability. Preserved codes are built upon
the channel's information-preserving structures (IPSs),
which in turn inherit matrix algebra structure from fixed
point sets of CPTP maps. This allows for a very elegant
and concise description of the full information-carrying
capability of any code. We also discussed several operational
variations on preservation, with particular attention
to infinite-distance codes, and applied the theory to
find all of a channel's noiseless, unitarily noiseless, and
unconditionally preserved codes.

A number of important open problems and directions
for further investigation remain. We have not explicitly
addressed continuous-time quantum processes. Such a
process is described by a 1-parameter family $\{\mathcal{E}_t : t \geq 0\}$
of CPTP maps. A special subclass with particular physical
significance is Markovian noise, where $\mathcal{E}_t = e^{t\mathcal{L}}$ for
some Liouville semigroup generator $\mathcal{L}$ [43]. In principle,
our definitions of noiseless and unitarily noiseless codes
extend to the Markovian setting, suggesting connections
to recent studies of DFSs/NSs under Markovian noise
(see in particular Refs. [25, 26, 44, 45]), and to earlier
approaches such as "damping bases" developed in the
context of quantum optics [46]. However, we believe it
will be necessary to extend our notion of correctability to
address continuous-time QEC, as developed for instance
in [47].

Our analysis has focused on information preservation 
under the uncontrolled ("free") evolution of an open system.
The ability to control that system's dynamics while
it is experiencing noise (rather than correcting the er- 
rors after they occur) raises questions that are inter- 
esting for practical quantum information processing and 
from a control-theoretic perspective. It would be valu- 
able to know how to synthesize dynamics that support 
a given (desired) IPS, using externally applied control, 
much as DFSs/NSs can be engineered using open-loop
unitary manipulations [48] or closed-loop feedback pro- 
tocols [261 BH • 

Our current framework does not address "post- 
selective" preservation of information, where the infor- 
mation is preserved conditional on a particular measure- 
ment outcome. Another natural direction for generaliza- 
tion is to relax the "zero-error" requirement, looking at 
imperfectly preserved information under CPTP channels 
or more general noisy dynamics. Preliminary investigations
[10] indicate that partial extensions of some of the
structures present in the perfect case carry over to the 
approximate case, but a variety of interesting complica- 
tions arise. A final question that deserves further inves- 
tigation arises when the information-carrying system is 
not initially fully decoupled from its environment. This 
particular kind of initialization error can produce noise 
which cannot be described by CP maps, and its analysis 
must address the influence of (weak) initial correlation 
with the environment on the information [supposedly] 
stored within the system.

Appendix A: Our framework for analyzing
information

1. Our notion of information: Relation to Shannon 
theory 

The most common technical meaning of "information"
comes from Shannon's theory of communication
[20, 21, 49]. Here, Alice and Bob are connected by a
communication channel $\mathcal{E}$ (a dynamical map between input
states and output states), and also have:

1. A codebook that tells Bob which signals Alice 
might send; 

2. The patience and ability to send signals requiring 
arbitrarily many uses of the channel; 

3. A willingness to tolerate a very small probability of 
failure; 

4. A guarantee that £ will be applied exactly once. 

Although this paradigm is the backbone of both classical 
and quantum information theory, it is not unique. Any 
or all of the above resources may be unavailable: 

• Sometimes there is no codebook restricting the 
possible signals. In scientific applications, the 
source of information is generally a natural phe- 
nomenon rather than a canny and cooperative 
sender. This observational paradigm restricts the 
questions whose answers the receiver can learn. 

• In real-time applications, a signal has to be transmitted
within a strictly limited number (N) of
channel uses. This eliminates the second resource
(encoding over arbitrarily many uses), and motivates
single-shot capacity: What can we accomplish
with a single use of the channel $\mathcal{E}^{\otimes N}$?

• Some applications demand perfect reliability. This 
eliminates the third resource (tolerance of arbitrar- 
ily small failure probability), and yields zero-error 
information theory [50] [51] . 

• Memory devices, which store information rather than transmitting it, may violate the
guarantee that $\mathcal{E}$ is applied exactly once. We may wish our information to be
preserved for an arbitrary number of clock cycles, or $\mathcal{E}$ may be a snapshot of a
continuous process. When $\mathcal{E}$ may be applied many times, we turn to error
correction. Correctable information requires active correction after each iteration of
$\mathcal{E}$; noiseless information persists through repeated iterations of $\mathcal{E}$
with no intervention. 

In this paper, we are concerned primarily with iden- 
tifying the kinds of information that can be preserved, 
rather than the rate at which information can be sent or 
stored. So, we focus on zero-error information and the 
single-shot paradigm. This does not really affect the generality of our results: Since
they apply to arbitrary channels, we can discuss $\mathcal{E}^{\otimes N}$ for any $N$. We do
not know for certain, however, whether tolerating an asymptotically small amount of error
changes the kinds of information that can be preserved by $\mathcal{E}^{\otimes N}$. 

The other two resources (a pre-existing codebook, and 
exact knowledge of $\mathcal{E}$) are quite important. They yield 
different preservation criteria, with substantially different 
consequences, and we consider them separately. 

2. On the usefulness and generality of codes 

Our framework for analyzing preserved information re- 
lies on codes to describe different kinds of information. 
A code is an arbitrary set of preparations (states) for a physical system $\mathcal{S}$,
representing the alternatives available to the sender. Essentially, a code describes a very
generalized "subsystem", in which information can be encoded. We settled on this formalism
after quite a bit of 
thought and exploration, and expect that some readers 
may seek a more extensive explanation of why we believe 
it is useful, general, and powerful. The most efficient way 
to do so might be to anticipate some potential objections. 

• Using "questions" to define information seems in- 
herently classical, and inadequate to describe quan- 
tum information. The idea of a question, with a 
definite answer, is indeed inherently classical. Human beings are unavoidably classical,
and as Bohr famously insisted [52], our descriptions and perceptions of Nature are always
classical. As such, we believe that a precise and general definition of "information" must
rely on classical concepts. We can nonetheless describe quantum information in this
framework. The difference between a classical bit and a quantum bit is that the bit admits
just one sharp question, "Is the bit 0 or 1?", whereas the qubit supports an infinite
continuum of inequivalent sharp questions, "Is the qubit in state $|\psi\rangle$ or state
$|\psi^{\perp}\rangle$?", for every orthogonal basis $\{|\psi\rangle, |\psi^{\perp}\rangle\}$. 
By using classical questions as a common denom- 
inator to define both classical and quantum infor- 
mation in the same lingua franca, we have a frame- 
work that is open to novel forms of information - 
rather than begging the question of whether they 
exist. 

• This definition does not seem to capture entanglement as a form of information - i.e.,
that $\mathcal{E}$ might preserve entanglement between $\mathcal{S}$ and a reference system
$\mathcal{R}$. Entanglement is a peculiarly quantum form of correlation, wherein the state
of $\mathcal{S}$ is conditional upon observations on the reference system. Projecting
$\mathcal{R}$ into a state $|\psi\rangle$ steers [53] $\mathcal{S}$ into a corresponding
$\rho_\psi$. It is not difficult to show that $\mathcal{E}$ preserves this entanglement if
and only if it also preserves the code comprising all $\rho_\psi$ into which $\mathcal{S}$
can be steered. Thus, the code paradigm does address entanglement as a form of information. 

• Preserved information should be addressed in the 
Heisenberg picture, by considering preserved ob- 
servables rather than states. In fact, our analy- 
sis proceeds along these lines; we demand that ev- 
ery measurement for distinguishing between code 
states be reproducible on Bob's end. However, the 
code C is a crucial ingredient in defining a kind 
of information, because it determines which mea- 
surements need to be reproducible! Otherwise, it 
is easy to identify all POVMs that can be repro- 
duced on Bob's end with "preserved information" 
[7], an approach that we believe is subtly flawed. 
A preserved measurement $\mathcal{M}$ represents perfectly preserved information only if
there is some circumstance under which Alice would measure $\mathcal{M}$ in order to answer
a question. If $\mathcal{M}$ is inherently noisy and error-laden, then for any question
Alice might ask, there is always some $\mathcal{M}'$ that would yield a better answer. The
fact that $\mathcal{M}$ can be reproduced by Bob is irrelevant if Alice would never choose
to make that measurement. 

• The whole idea of a code is appropriate only in 
the communication-theoretic paradigm, not the ob- 
servational one. If the input to the channel is 
controlled by an oblivious system (e.g., a distant 
star) rather than a cooperative sender, then the re- 
ceiver/observer cannot rely on preparation within 
the code. This is correct - and yet the framework 
works nonetheless. If any information is perfectly 
preserved by the channel, then there must be at 
least two input states that remain distinguishable 
at the output. Conversely, if the channel mixes up 
every pair of input states, then there is absolutely 
no question that Bob can answer as well as Alice. 

It is true that the semantic meaning of a "code" is 
inappropriate to the observational paradigm, since 
an oblivious "sender" is unlikely to cooperate by 
carefully preparing within a code. Ultimately, this 
is why we focus not on codes, but on the underlying 
IPS. The existence of a preserved code is merely a 
symptom of the underlying structure; if a code ex- 
ists, then there is potentially an entire equivalence 
class of codes. This is especially true in the case 
of unconditionally preserved information (the only 
kind relevant to observation), where the recovery 
map £ [recall Eq. (11)] does not depend on any 
prior information about the code (e.g., a subspace projector $P$). An unconditionally
preserved IPS is isometric to a subalgebra that spans the system's entire Hilbert space
(rather than a subspace $\mathcal{P}$). 
Every observable in this algebra can be observed 
faithfully by the observer at the channel's output. 
Thus, in this situation, the code framework is ancil- 
lary to the real question - but it works nonetheless. 



Appendix B: Proofs 

In this section, we present complete proofs of the tech- 
nical results stated in the main text. 



1. Preserved information is correctable 

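Theorem 1. A [convex] code $\mathcal{C}$ is correctable for $\mathcal{E}$ if and only if it
is preserved by $\mathcal{E}$. 

Proof. The "only if" direction is straightforward. For any $\rho, \sigma \in \mathcal{C}$
and any $p \in [0,1]$, define the weighted difference $\Delta = p\rho - (1-p)\sigma$. If
$\mathcal{C}$ is correctable, then there exists a CPTP map $\mathcal{R}$ such that, for
every such $\Delta$, $\|\Delta\|_1 = \|(\mathcal{R} \circ \mathcal{E})(\Delta)\|_1$. The
trace norm is contractive under CPTP maps [34], so 

$$\|(\mathcal{R} \circ \mathcal{E})(\Delta)\|_1 \le \|\mathcal{E}(\Delta)\|_1 \le \|\Delta\|_1 .$$

Combining these two expressions yields $\|\mathcal{E}(\Delta)\|_1 = \|\Delta\|_1$, which
means that $\mathcal{C}$ is preserved by $\mathcal{E}$. 

To prove that preservation implies correctability, we give an explicit correction
operation. This operation is known as the transpose channel [35], defined as 

$$\hat{\mathcal{E}}_{\mathcal{P}} = \Pi \circ \mathcal{E}^\dagger \circ \mathcal{N},$$

where $\mathcal{P}$ is the joint support of all $\rho \in \mathcal{C}$, $\Pi$ is the
projection onto $\mathcal{P}$, $P$ is the projector onto $\mathcal{P}$,
$\mathcal{E}^\dagger$ is the adjoint map of $\mathcal{E}$, and $\mathcal{N}$ is a
normalization map given below. If the operator sum representation (OSR) of $\mathcal{E}$ is 

$$\mathcal{E}(\rho) = \sum_i E_i \rho E_i^\dagger ,$$

then the OSRs for these maps are: 

$$\Pi(\rho) = P \rho P,$$

$$\mathcal{E}^\dagger(\rho) = \sum_i E_i^\dagger \rho E_i,$$

$$\mathcal{N}(\rho) = \mathcal{E}(P)^{-1/2} \, \rho \, \mathcal{E}(P)^{-1/2},$$

$$\hat{\mathcal{E}}_{\mathcal{P}}(\rho) = \sum_i \left(P E_i^\dagger \mathcal{E}(P)^{-1/2}\right) \rho \left(\mathcal{E}(P)^{-1/2} E_i P\right).$$

Note that the inverse in $\mathcal{E}(P)^{-1/2}$ is taken on the support of
$\mathcal{E}(P)$. It is simple to verify that $\hat{\mathcal{E}}_{\mathcal{P}}$ is a
trace-preserving CP map. 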
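
As a rough numerical illustration of the transpose channel defined above, the following sketch builds its Kraus operators $P E_i^\dagger \mathcal{E}(P)^{-1/2}$ and applies them after the noise. The noise model (a single bit flip on one of three qubits, or no flip, each with probability 1/4), the code subspace span{|000>, |111>}, and all function names are illustrative assumptions, not taken from the paper. Trace preservation is checked only on the support of $\mathcal{E}(P)$, which here happens to be the whole space.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Apply a CP map given by Kraus operators: rho -> sum_i K rho K^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)

def inv_sqrt_on_support(A, tol=1e-12):
    """Inverse square root of a positive semidefinite matrix, taken on its support only."""
    w, v = np.linalg.eigh(A)
    inv = np.array([1.0 / np.sqrt(x) if x > tol else 0.0 for x in w])
    return (v * inv) @ v.conj().T

def transpose_channel(kraus, P):
    """Kraus operators P E_i^dagger E(P)^{-1/2} of the transpose channel."""
    N = inv_sqrt_on_support(apply_channel(kraus, P))
    return [P @ K.conj().T @ N for K in kraus]

# Assumed noise model: one of {I, X1, X2, X3}, each with probability 1/4.
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def on_qubits(a, b, c):
    return np.kron(a, np.kron(b, c))

kraus = [0.5 * on_qubits(*ops)
         for ops in [(I2, I2, I2), (X, I2, I2), (I2, X, I2), (I2, I2, X)]]

# Assumed code: all states supported on span{|000>, |111>}; P projects onto it.
P = np.zeros((8, 8))
P[0, 0] = P[7, 7] = 1.0
psi = np.zeros(8)
psi[0], psi[7] = 0.6, 0.8
rho = np.outer(psi, psi)

recovery = transpose_channel(kraus, P)
recovered = apply_channel(recovery, apply_channel(kraus, rho))
print(np.allclose(recovered, rho))
```

For this particular example the recovery Kraus operators reduce to $P X_i$, i.e. "undo the flip and project back onto the code", which is the familiar bit-flip recovery.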

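To prove that $\hat{\mathcal{E}}_{\mathcal{P}}$ corrects the code $\mathcal{C}$, we need a
couple of technical lemmas. The first makes rigorous the notion of a channel's action on a
subspace: 

Lemma 1.1. Let $\mathcal{E} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H}')$ be a
CP map, and $X_0$ be a positive semidefinite operator on $\mathcal{H}$. If $X$ is an
operator on the support of $X_0$, then $\mathcal{E}(X)$ is an operator on the support of
$\mathcal{E}(X_0)$. 

Proof. It suffices to take $X$ positive, since a general $X$ is a linear combination of
positive operators on the same support. Both $X_0$ and $X$ are diagonalizable, so $X_0$ has
a smallest nonzero eigenvalue, and $X$ has a largest eigenvalue. Thus for some
$\epsilon > 0$, $X_0 \ge \epsilon X$, which means that $X_0 - \epsilon X \ge 0$. Since
$\mathcal{E}$ is CP, $\mathcal{E}(X_0 - \epsilon X) \ge 0$. Because it is linear,
$\mathcal{E}(X_0) \ge \epsilon \mathcal{E}(X)$. This implies that $\mathcal{E}(X)$ is
supported on the support of $\mathcal{E}(X_0)$. ■ 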

Now, recall that discriminating between two code states involves a binary (Helstrom)
measurement that projects onto one of two orthogonal subspaces. Our second lemma states
that if a channel $\mathcal{E}$ preserves a code $\mathcal{C}$, it also preserves the
orthogonality of these subspaces. 

Lemma 1.2. Let $\mathcal{E}$ be a CPTP map, $\rho$ and $\sigma$ be states in a code
$\mathcal{C}$ that is preserved by $\mathcal{E}$, and $p \in [0,1]$. Let us write
$\Delta = p\rho - (1-p)\sigma$ in terms of its positive and negative parts, as
$\Delta = \Delta_+ - \Delta_-$, where $\Delta_\pm$ are positive operators with disjoint
supports. Then $\mathcal{E}(\Delta_+)$ and $\mathcal{E}(\Delta_-)$ have disjoint supports. 

Proof. The triangle inequality for the trace norm, together with the fact that
$\mathcal{E}$ is TP, gives 

$$\|\mathcal{E}(\Delta)\|_1 = \|\mathcal{E}(\Delta_+) - \mathcal{E}(\Delta_-)\|_1 \le \|\mathcal{E}(\Delta_+)\|_1 + \|\mathcal{E}(\Delta_-)\|_1 = \mathrm{tr}(\Delta_+) + \mathrm{tr}(\Delta_-). \tag{B1}$$

Because $\mathcal{C}$ is preserved,
$\|\mathcal{E}(\Delta)\|_1 = \|\Delta\|_1 = \mathrm{tr}(\Delta_+) + \mathrm{tr}(\Delta_-)$.
This implies equality throughout Eq. (B1), that is,
$\|\mathcal{E}(\Delta_+) - \mathcal{E}(\Delta_-)\|_1 = \|\mathcal{E}(\Delta_+)\|_1 + \|\mathcal{E}(\Delta_-)\|_1$.
This is possible if and only if $\mathcal{E}(\Delta_+)$ and $\mathcal{E}(\Delta_-)$ have
disjoint supports. ■ 
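
The statement of Lemma 1.2 can be checked numerically on a small example. The sketch below splits a weighted difference into positive and negative parts and verifies that a channel which preserves the trace norm also keeps their images orthogonal. The noise model (one bit flip on three qubits, or none, each with probability 1/4), the two code states, the weight p = 0.3, and the helper names are assumptions chosen for illustration only.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Apply a CP map given by Kraus operators."""
    return sum(K @ rho @ K.conj().T for K in kraus)

def pos_neg_parts(delta):
    """Split a Hermitian matrix as delta = plus - minus with plus, minus >= 0."""
    w, v = np.linalg.eigh(delta)
    plus = (v * np.clip(w, 0.0, None)) @ v.conj().T
    minus = (v * np.clip(-w, 0.0, None)) @ v.conj().T
    return plus, minus

def trace_norm(delta):
    """Trace norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return float(np.abs(np.linalg.eigvalsh(delta)).sum())

# Assumed noise: one of {I, X1, X2, X3} with probability 1/4 each.
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
kron3 = lambda a, b, c: np.kron(a, np.kron(b, c))
kraus = [0.5 * kron3(*ops)
         for ops in [(I2, I2, I2), (X, I2, I2), (I2, X, I2), (I2, I2, X)]]

# Two assumed code states on span{|000>, |111>} and a weight p.
rho = np.zeros((8, 8))
rho[0, 0] = 1.0
plus_l = np.zeros(8)
plus_l[0] = plus_l[7] = 1.0 / np.sqrt(2.0)
sigma = np.outer(plus_l, plus_l)
p = 0.3
delta = p * rho - (1 - p) * sigma

d_plus, d_minus = pos_neg_parts(delta)
norm_in = trace_norm(delta)
norm_out = trace_norm(apply_channel(kraus, delta))
# For positive operators A, B: tr(AB) = 0 exactly when their supports are orthogonal.
overlap = np.trace(apply_channel(kraus, d_plus) @ apply_channel(kraus, d_minus)).real
print(abs(norm_in - norm_out) < 1e-9, abs(overlap) < 1e-9)
```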

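Armed with these results, we wish to prove that $\mathcal{C}$ is noiseless for
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$. To do so, we will show that for every
Helstrom measurement $\{\mathcal{P}_+, \mathcal{P}_-\}$ that distinguishes between two
states in $\mathcal{C}$, the subspaces $\mathcal{P}_\pm$ are invariant under
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$. First, we prove this for the special
case where the measurement forms a partition of $\mathcal{P}$ (that is, $\Delta$ is
full-rank). 

Lemma 1.3. Define $\mathcal{E}$ and $\Delta$ as in Lemma 1.2. Define
$\mathcal{P}_\pm = \mathrm{supp}(\Delta_\pm)$ and $P_\pm$ as the projector onto
$\mathcal{P}_\pm$. Then, if $\Delta$ is full-rank on $\mathcal{P}$, $\mathcal{P}_+$ and
$\mathcal{P}_-$ are invariant subspaces under
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$. 

Proof. $\hat{\mathcal{E}}_{\mathcal{P}}$ is a composition of three CP maps, so
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$ can be written as a composition of four
maps: $\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E} = \Pi \circ \mathcal{E}^\dagger \circ \mathcal{N} \circ \mathcal{E}$.
Let us define the subspaces $\mathcal{Q}_\pm = \mathrm{supp}(\mathcal{E}(\Delta_\pm))$, and
$Q_\pm$ as the projectors onto $\mathcal{Q}_\pm$. We will prove the lemma by following the
subspaces $\mathcal{P}_\pm$ through each of the four maps. 

By Lemma 1.1, $\mathcal{E}$ maps every operator on $\mathcal{P}_+$ to an operator on
$\mathcal{Q}_+$, and every operator on $\mathcal{P}_-$ to one on $\mathcal{Q}_-$. By Lemma
1.2, $\mathcal{Q}_\pm$ are disjoint. Thus, $\mathcal{E}$ maps $\mathcal{P}_\pm$ to disjoint
subspaces $\mathcal{Q}_\pm$. 

Now we consider $\mathcal{N}$. $P_\pm$ and $\Delta_\pm$ have the same support, so
$\mathcal{E}(P_\pm)$ is supported on $\mathcal{Q}_\pm$. Thus, $\mathcal{E}(P_+)$ and
$\mathcal{E}(P_-)$ have disjoint supports, and because $P = P_+ + P_-$, 

$$\mathcal{E}(P)^{-1/2} = \mathcal{E}(P_+)^{-1/2} + \mathcal{E}(P_-)^{-1/2},$$

and so $\mathcal{N}$ maps $\mathcal{Q}_+ \to \mathcal{Q}_+$ and
$\mathcal{Q}_- \to \mathcal{Q}_-$. 

Now we consider $\mathcal{E}^\dagger$. Using the cyclic property of the trace,
$\mathrm{tr}(Q_\pm \mathcal{E}(P_\mp)) = 0$ implies
$\mathrm{tr}(P_\mp \mathcal{E}^\dagger(Q_\pm)) = 0$. By Lemma 1.1, $\mathcal{E}^\dagger$
does not map $\mathcal{Q}_\pm$ into $\mathcal{P}_\mp$, which means that
$\mathcal{E}^\dagger$ maps $\mathcal{Q}_\pm$ to $\mathcal{P}_\pm$. 

Thus, $\mathcal{E}^\dagger \circ \mathcal{N} \circ \mathcal{E}$ maps
$\mathcal{P}_\pm \to \mathcal{Q}_\pm \to \mathcal{Q}_\pm \to \mathcal{P}_\pm$. The final
projection $\Pi$ has no effect on any operator in $\mathcal{P}$, so
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$ maps
$\mathcal{P}_\pm \to \mathcal{P}_\pm$. ■ 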

Lemma 1.3 is the core of the proof for Theorem 1. To complete the proof, we need to extend
it to cases where $\Delta$ is not full rank, and therefore
$\{\mathcal{P}_+, \mathcal{P}_-\}$ do not form a partition of $\mathcal{P}$. 

Lemma 1.4. Lemma 1.3 holds even if $\Delta$ is not full-rank on $\mathcal{P}$. 

Proof. There exists a full-rank (on $\mathcal{P}$) state $\rho_0 \in \mathcal{C}$. This
follows because $\mathcal{P}$ is the support of $\mathcal{C}$, and $\mathcal{C}$ is convex.
For any $\epsilon \in (0, 1)$, $(1-\epsilon)\rho + \epsilon\rho_0$ is full rank. So we
consider, in place of $\rho$, a sequence of full-rank states $\{\rho'_n\}$, where
$\rho'_n = (1-\epsilon_n)\rho + \epsilon_n \rho_0$, and $\{\epsilon_n\}$ converges to 0.
Lemma 1.3 applied to the sequence of full-rank weighted differences
$\Delta'^{(n)} = p\rho'_n - (1-p)\sigma$ implies that the corresponding partitions
$\{\mathcal{P}_+^{(n)}, \mathcal{P}_-^{(n)}\}$ are invariant subspaces. As $n \to \infty$,
$\Delta'^{(n)}_-$ converges to $\Delta_-$, and $\mathcal{P}_-^{(n)}$ converges to
$\mathcal{P}_-$, while $\mathcal{P}_+^{(n)}$ converges to the orthogonal complement of
$\mathcal{P}_-$ in $\mathcal{P}$. Thus $\mathcal{P}_-$ is invariant under
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$. The same argument, but with $\sigma$
replaced by $\sigma'_n = (1-\epsilon_n)\sigma + \epsilon_n \rho_0$, shows that
$\mathcal{P}_+$ is invariant under $\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$. ■ 



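Armed with Lemmas 1.3 and 1.4, it is now easy to prove that $\mathcal{C}$ is noiseless for
$\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$. Consider an arbitrary convex
combination of powers of $\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}$, 

$$\mathcal{F} = \sum_n p_n \left(\hat{\mathcal{E}}_{\mathcal{P}} \circ \mathcal{E}\right)^n,$$

where $\{p_n\}$ is a probability distribution over non-negative integers. Let $\Delta$ be a
weighted difference of code states. By Lemmas 1.3 and 1.4, the supports of $\Delta_+$ and
$\Delta_-$ are invariant and disjoint subspaces. Since $\mathcal{F}$ is trace-preserving, 

$$\|\mathcal{F}(\Delta)\|_1 = \mathrm{tr}(\mathcal{F}(\Delta_+)) + \mathrm{tr}(\mathcal{F}(\Delta_-)) = \mathrm{tr}(\Delta_+) + \mathrm{tr}(\Delta_-) = \|\Delta\|_1. \tag{B2}$$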

This condition - satisfied for all $\Delta$ - is sufficient for $\mathcal{C}$ to be
noiseless. ■ 

2. The structure of noiseless codes 

Lemma 2. Every noiseless code $\mathcal{C}$ for $\mathcal{E}$ is isometric to a set of
states that are fixed points of $\mathcal{E}$. 

Proof. Consider the CPTP map 

$$\mathcal{E}_\infty = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mathcal{E}^n .$$

The limit is well-defined for any map on a finite-dimensional Hilbert space. Note that
$\mathcal{E} \circ \mathcal{E}_\infty = \mathcal{E}_\infty$, so
$\mathcal{E}[\mathcal{E}_\infty(\rho)] = \mathcal{E}_\infty(\rho)$ for any
$\rho \in \mathcal{C}$. That is, $\mathcal{E}_\infty$ projects onto the fixed points of
$\mathcal{E}$. Now, if $\mathcal{C}$ is noiseless for $\mathcal{E}$, then it is preserved
by any convex combination of powers of $\mathcal{E}$, and hence by $\mathcal{E}_\infty$.
Since $\mathcal{C}$ is preserved by $\mathcal{E}_\infty$, $\mathcal{C}$ is isometric to
$\mathcal{E}_\infty(\mathcal{C})$ (see Definition 7). As noted above,
$\mathcal{E}_\infty(\mathcal{C})$ consists entirely of fixed states, so $\mathcal{C}$ is
isometric to a set of fixed states. ■ 
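
The averaged map used in the proof of Lemma 2 can be approximated numerically by truncating the average at a finite number of terms. The sketch below does this for a single-qubit dephasing channel; the channel, the dephasing probability 0.3, the truncation at 2000 terms, and the function names are assumptions made for illustration. The finite average only approximates the limit, with an error that shrinks roughly like 1/N.

```python
import numpy as np

def superoperator(kraus):
    """Matrix of rho -> sum_i K rho K^dagger acting on the row-major vec(rho)."""
    return sum(np.kron(K, K.conj()) for K in kraus)

def cesaro_mean(S, n_terms=2000):
    """Finite-N approximation (1/N) * sum_{n=1}^{N} S^n of the limiting averaged map."""
    power = np.eye(S.shape[0], dtype=complex)
    total = np.zeros_like(power)
    for _ in range(n_terms):
        power = power @ S
        total += power
    return total / n_terms

# Assumed example: dephasing with probability p in the computational basis.
p = 0.3
Z = np.diag([1.0, -1.0])
kraus = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * Z]

S = superoperator(kraus)
E_inf = cesaro_mean(S)

# Apply the averaged map to |+><+|; the coherences are averaged away.
plus = np.full((2, 2), 0.5)
image = (E_inf @ plus.reshape(-1)).reshape(2, 2)
print(np.round(image.real, 3))
```

In this example the fixed points are the diagonal matrices, and the averaged map sends every state to its diagonal part, so it acts as a projection onto the fixed-point set.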

Corollary 3. Every maximum noiseless code for a channel $\mathcal{E}$ is isometric to the
full fixed-point set of $\mathcal{E}$. 

Proof. Let $\mathcal{C}$ be a noiseless code for $\mathcal{E}$. By Lemma 2, $\mathcal{C}$
is isometric to a subset of the fixed states. The fixed states themselves form a noiseless
code $\mathcal{C}_{\max}$. If $\mathcal{C}$ is isometric to a proper subset of the fixed
states, then $\mathcal{C}$ is strictly smaller than $\mathcal{C}_{\max}$, and is therefore
not maximum. ■ 

A similar result for preserved codes follows from the fact that they can be made noiseless
(Theorem 1). 

Theorem 4. Every maximum preserved code for a CPTP map $\mathcal{E}$ is 1-isometric to the
full set of fixed states for some other CPTP map $\mathcal{R} \circ \mathcal{E}$. 

Proof. This follows from combining Lemma 2 with Theorem 1 and Definition 9. ■ 

These results tell us that maximum preserved codes have the same structure as fixed-state
sets - but not what that structure is. The following theorem fills that gap, defining the
structure of an arbitrary CPTP map's fixed points. It also characterizes the fixed points
of the adjoint map $\mathcal{E}^\dagger$ (defined so that if
$\mathcal{E}(\rho) = \sum_i E_i \rho E_i^\dagger$, then
$\mathcal{E}^\dagger(\rho) = \sum_i E_i^\dagger \rho E_i$). This extra result is useful in
Section V, in the algorithm for finding noiseless codes of $\mathcal{E}$. 

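Theorem 5. Let $\mathcal{E}$ be a CPTP map on $\mathcal{B}(\mathcal{H})$, and
$\mathcal{E}^\dagger$ its adjoint. Let $\mathrm{Fix}(\mathcal{E})$ be the fixed points of
$\mathcal{E}$, and $\mathrm{Fix}(\mathcal{E}^\dagger)$ the fixed points of
$\mathcal{E}^\dagger$. Then, 

(i) Let $\mathcal{P}_0 \subseteq \mathcal{H}$ be the support of
$\mathrm{Fix}(\mathcal{E})$. Then $\mathcal{P}_0$ is an invariant subspace under
$\mathcal{E}$. 

(ii) Let $\mathcal{E}_{\mathcal{P}_0}$ be the restriction of $\mathcal{E}$ to
$\mathcal{P}_0$, so $\mathcal{E}_{\mathcal{P}_0} = \Pi_0 \circ \mathcal{E} \circ \Pi_0$,
where $\Pi_0$ projects onto $\mathcal{P}_0$. Then the fixed points of
$\mathcal{E}^\dagger_{\mathcal{P}_0}$ form a matrix algebra $\mathcal{A}$. 

(iii) $\mathrm{Fix}(\mathcal{E})$ is a distortion of $\mathcal{A}$. 

(iv) $\mathrm{Fix}(\mathcal{E}^\dagger)$ is a 1:1 extension of $\mathcal{A}$ from
$\mathcal{P}_0$ to $\mathcal{H}$. That is, for each $X \in \mathcal{A}$, there exists
precisely one $X' \in \mathrm{Fix}(\mathcal{E}^\dagger)$ so that
$X = \Pi_0(X') = P_0 X' P_0$. 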

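Proof. First, we will prove that $\mathcal{P}_0$ is an invariant subspace under
$\mathcal{E}$, using the following lemma. 

Lemma 5.1. $\mathrm{Fix}(\mathcal{E})$ contains a positive, full-rank (on $\mathcal{P}_0$)
operator, i.e., there exists $\rho_0 \in \mathrm{Fix}(\mathcal{E})$ such that
$\langle\psi|\rho_0|\psi\rangle > 0$ for all pure states $|\psi\rangle \in \mathcal{P}_0$. 

Proof. Let $\rho_0 = \mathcal{E}_\infty(\mathbb{1})$, where $\mathbb{1}$ is the identity on
$\mathcal{H}$. Since $\mathcal{E}_\infty$ is CP and projects onto fixed points of
$\mathcal{E}$, $\rho_0$ must be a non-negative fixed point of $\mathcal{E}$, and hence is
in $\mathrm{Fix}(\mathcal{E})$. Let $\mathcal{Q} \subseteq \mathcal{P}_0$ be the support of
$\rho_0$. We want to show that $\mathcal{Q} = \mathcal{P}_0$. Suppose $\mathcal{Q}$ is a
proper subspace of $\mathcal{P}_0$. Then, there exists $|\psi\rangle$ in
$\mathcal{P}_0 \setminus \mathcal{Q}$ such that $\langle\psi|\rho_0|\psi\rangle = 0$, but
there exists $X \in \mathrm{Fix}(\mathcal{E})$ such that
$\langle\psi|X|\psi\rangle \neq 0$. Let $Y$ be one of the four possible Hermitian
operators $\pm(X + X^\dagger)$, $\pm i(X - X^\dagger)$, chosen so that
$\langle\psi|Y|\psi\rangle < 0$ (this must be true for at least one of the four
possibilities). Since $X^\dagger$, $-X$ and $iX$ are all in $\mathrm{Fix}(\mathcal{E})$ if
$X \in \mathrm{Fix}(\mathcal{E})$, $Y$ is also in $\mathrm{Fix}(\mathcal{E})$, so
$\mathcal{E}_\infty(Y) = Y$. Now consider the operator $\rho = \mathbb{1} + \delta Y$,
where $\delta > 0$ is chosen small enough so that $\rho$ is non-negative. Then,
$\mathcal{E}_\infty(\rho) = \rho_0 + \delta Y$. However,
$\langle\psi|\mathcal{E}_\infty(\rho)|\psi\rangle = \delta \langle\psi|Y|\psi\rangle < 0$,
which contradicts the CP property of $\mathcal{E}_\infty$. Therefore,
$\mathcal{Q} = \mathcal{P}_0$, and $\rho_0$ is the desired positive, full-rank fixed
operator. ■ 

Applying Lemma 1.1 to $\rho_0$ implies that $\mathcal{P}_0$ is an invariant subspace under
$\mathcal{E}$, which proves part (i) of the theorem. 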

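Now, to prove part (ii), we consider
$\mathcal{E}_{\mathcal{P}_0} = \Pi_0 \circ \mathcal{E} \circ \Pi_0$, the restriction of
$\mathcal{E}$ to $\mathcal{P}_0$. Its Kraus operators are $\{K_i\} = \{P_0 E_i P_0\}$,
where $P_0$ is the projector onto $\mathcal{P}_0$. Since $\mathcal{P}_0$ is an invariant
subspace, $E_i P_0 = P_0 E_i P_0 \ \forall i$, which means that
$\mathcal{E}_{\mathcal{P}_0}$ is TP, i.e. $\sum_i K_i^\dagger K_i = P_0$. Furthermore,
since all of $\mathcal{E}$'s fixed points are supported on $\mathcal{P}_0$,
$\mathcal{E}_{\mathcal{P}_0}$ has the same fixed points as $\mathcal{E}$. 

We can now show that $\mathcal{E}^\dagger_{\mathcal{P}_0}$'s fixed points must commute
with its Kraus operators. 

Lemma 5.2. For any $X \in \mathcal{B}(\mathcal{P}_0)$,
$\mathcal{E}^\dagger_{\mathcal{P}_0}(X) = X$ if and only if $[X, K_i] = 0$ for all $i$. 

Proof. If $[X, K_i] = 0 \ \forall i$, then 

$$\mathcal{E}^\dagger_{\mathcal{P}_0}(X) = \sum_i K_i^\dagger X K_i = \Big(\sum_i K_i^\dagger K_i\Big) X = P_0 X = X.$$

Conversely, suppose $\mathcal{E}^\dagger_{\mathcal{P}_0}(X) = X$. Consider the quantity 

$$\sum_i [X, K_i]^\dagger [X, K_i] = \mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger X) - X^\dagger X,$$

after some algebra. By construction, this is non-negative. Now, observe that 

$$\mathrm{Tr}\{\rho_0 [\mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger X) - X^\dagger X]\} = \mathrm{Tr}\{\mathcal{E}_{\mathcal{P}_0}(\rho_0) X^\dagger X\} - \mathrm{Tr}\{\rho_0 X^\dagger X\} = 0,$$

since $\rho_0$ is fixed under $\mathcal{E}$ (and hence $\mathcal{E}_{\mathcal{P}_0}$).
Because $\rho_0$ is full-rank and positive, for any positive operator
$Y \in \mathcal{B}(\mathcal{P}_0)$, $\mathrm{Tr}(\rho_0 Y) = 0 \Leftrightarrow Y = 0$.
Therefore, $\mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger X) - X^\dagger X = 0$, and
$\sum_i [X, K_i]^\dagger [X, K_i] = 0$. Since every term in the sum is non-negative, we
conclude that $[X, K_i] = 0 \ \forall i$. (Note: This proof is adapted from a result in
[54].) ■ 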
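
The "after some algebra" step in the proof of Lemma 5.2 can be spelled out as follows. This is one possible way to fill in the computation, assuming (as in the proof) that $\sum_i K_i^\dagger K_i = P_0$ acts as the identity on $\mathcal{B}(\mathcal{P}_0)$, and using that the adjoint map preserves Hermitian conjugation, so that $X^\dagger$ is fixed whenever $X$ is:

```latex
\begin{align*}
\sum_i [X,K_i]^\dagger [X,K_i]
 &= \sum_i \left(K_i^\dagger X^\dagger - X^\dagger K_i^\dagger\right)\left(X K_i - K_i X\right) \\
 &= \sum_i K_i^\dagger X^\dagger X K_i
    - \Big(\sum_i K_i^\dagger X^\dagger K_i\Big) X
    - X^\dagger \Big(\sum_i K_i^\dagger X K_i\Big)
    + X^\dagger \Big(\sum_i K_i^\dagger K_i\Big) X \\
 &= \mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger X)
    - \mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger)\, X
    - X^\dagger\, \mathcal{E}^\dagger_{\mathcal{P}_0}(X)
    + X^\dagger P_0 X \\
 &= \mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger X) - X^\dagger X - X^\dagger X + X^\dagger X
  \;=\; \mathcal{E}^\dagger_{\mathcal{P}_0}(X^\dagger X) - X^\dagger X .
\end{align*}
```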



Lemma 5.2 tells us that the fixed points of $\mathcal{E}^\dagger_{\mathcal{P}_0}$ are
precisely the commutant in $\mathcal{B}(\mathcal{P}_0)$ of the Kraus operators $\{K_i\}$.
Commutants are closed under addition and multiplication, and the fixed points of
$\mathcal{E}^\dagger_{\mathcal{P}_0}$ are closed under Hermitian conjugation. Therefore,
the fixed points of $\mathcal{E}^\dagger_{\mathcal{P}_0}$ form a matrix algebra, which
completes the proof of part (ii) of the theorem. 
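
The characterization of the adjoint map's fixed points as a commutant suggests a direct linear-algebra computation: the commutant is the null space of the stacked commutator maps $X \mapsto X K_i - K_i X$. The sketch below is one way to do this with a singular value decomposition; it is not the paper's algorithm. The example channel (Pauli noise on the second of two qubits, which is unital, so the support of the fixed points is the whole space), the tolerance, and the function names are assumptions made for illustration.

```python
import numpy as np

def commutant_basis(kraus, tol=1e-10):
    """Basis of {X : [X, K] = 0 for all K in kraus}, found as a null space via SVD."""
    d = kraus[0].shape[0]
    eye = np.eye(d)
    # Row-major vec: vec(X K - K X) = (I kron K^T - K kron I) vec(X).
    M = np.vstack([np.kron(eye, K.T) - np.kron(K, eye) for K in kraus])
    _, s, vh = np.linalg.svd(M)
    rank = int((s > tol).sum())
    # Right-singular vectors with zero singular value span the null space.
    return [vh[i].conj().reshape(d, d) for i in range(rank, vh.shape[0])]

def adjoint_map(kraus, X):
    """Adjoint (Heisenberg-picture) map: X -> sum_i K^dagger X K."""
    return sum(K.conj().T @ X @ K for K in kraus)

# Assumed example: uniformly random Pauli on the second of two qubits.
I2 = np.eye(2, dtype=complex)
paulis = [I2,
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]
kraus = [0.5 * np.kron(I2, s) for s in paulis]

basis = commutant_basis(kraus)
all_fixed = all(np.allclose(adjoint_map(kraus, X), X) for X in basis)
print(len(basis), all_fixed)
```

Here the commutant is the set of operators acting only on the first qubit, a four-dimensional matrix algebra, and each basis element is fixed by the adjoint map, in line with Lemma 5.2.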

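Let us denote this matrix algebra $\mathcal{A}$. The structure theorem for matrix algebras
(see Eq. [?] and Ref. [?]) states that, in some basis, we can write $\mathcal{A}$ as 

$$\mathcal{A} = \bigoplus_k \mathcal{M}_{A_k} \otimes \mathbb{1}_{B_k}, \tag{B3}$$

which induces a natural Hilbert space decomposition: 

$$\mathcal{H} = \mathcal{P}_0 \oplus \bar{\mathcal{P}}_0 = \Big[\bigoplus_k (A_k \otimes B_k)\Big] \oplus \bar{\mathcal{P}}_0. \tag{B4}$$

In this basis, we can say something about the Kraus operators of $\mathcal{E}$. 

Lemma 5.3. Given a CPTP map $\mathcal{E}$ on $\mathcal{B}(\mathcal{H})$, let
$\mathcal{P}_0$ be the support of its fixed points, and $\mathcal{A}$ the algebra fixed by
$\mathcal{E}^\dagger_{\mathcal{P}_0}$ (as in Theorem 5). In the decomposition of
$\mathcal{H}$ induced by $\mathcal{A}$ (Eq. (B4)), the Kraus operators of $\mathcal{E}$
have the form: 

$$E_i = \begin{pmatrix} \bigoplus_k \mathbb{1}_{A_k} \otimes K_{i,k} & D_i \\ 0 & C_i \end{pmatrix} \tag{B5}$$

for some operators $K_{i,k} \in \mathcal{B}(B_k)$, $C_i \in \mathcal{B}(\bar{\mathcal{P}}_0)$
and $D_i \in \mathcal{B}(\bar{\mathcal{P}}_0, \mathcal{P}_0)$. 

Proof. The $E_i$ operators can always be written in the 2x2 block form given above. Since
$C_i$ and $D_i$ are arbitrary, we need only show that the upper left block is of the given
form, and that the lower left block must vanish. The upper left block of each $E_i$ is a
Kraus operator $K_i$ of $\mathcal{E}_{\mathcal{P}_0}$. These are the Hermitian conjugates
of the Kraus operators for $\mathcal{E}^\dagger_{\mathcal{P}_0}$, which (by Lemma 5.2)
commute with $\mathcal{A}$. Therefore, they must be of the form 

$$K_i = \bigoplus_k \mathbb{1}_{A_k} \otimes K_{i,k},$$

which is the desired form for the upper left block. Finally, we observe that the lower left
block maps operators on $\mathcal{P}_0$ to operators on $\bar{\mathcal{P}}_0$. Since
$\mathcal{P}_0$ is an invariant subspace, this block must vanish. ■ 

In light of the above, $\mathcal{E}_{\mathcal{P}_0}$ acts trivially on each of the
"noiseless" factors $A_k$, but does something nontrivial on each of the "noisy" factors
$B_k$. Furthermore, $\mathcal{E}$ acts identically to $\mathcal{E}_{\mathcal{P}_0}$ on the
$\mathcal{P}_0$ subspace, but may do anything at all to its complement (including mapping
states on $\bar{\mathcal{P}}_0$ onto $\mathcal{P}_0$). 

The next step of the proof is to show that $\mathrm{Fix}(\mathcal{E})$ is a distortion of
$\mathcal{A}$. Recall that $\mathcal{E}$ and $\mathcal{E}_{\mathcal{P}_0}$ have the same
fixed points, so we need only characterize the fixed points of
$\mathcal{E}_{\mathcal{P}_0}$. We will do so by constructing a vector space of fixed
operators, then showing that this exhausts the fixed points of
$\mathcal{E}_{\mathcal{P}_0}$. 

Lemma 5.4. Following the notation in Theorem 5, let 

$$\mathcal{A} = \bigoplus_k \mathcal{M}_{A_k} \otimes \mathbb{1}_{B_k}$$

be the algebra fixed by $\mathcal{E}^\dagger_{\mathcal{P}_0}$. Then there exist positive
semidefinite operators $\tau_{B_k} \in \mathcal{B}(\mathcal{H}_{B_k})$ such that the
following distortion of $\mathcal{A}$, 

$$\tilde{\mathcal{A}} = \bigoplus_k \mathcal{M}_{A_k} \otimes \{\tau_{B_k}\},$$

consists entirely of operators that are fixed by $\mathcal{E}$. 

Proof. Let $X = \bigoplus_k X_{A_k} \otimes \tau_{B_k}$ be an element of
$\tilde{\mathcal{A}}$. By Lemma 5.3, 

$$\mathcal{E}(X) = \sum_i K_i X K_i^\dagger = \bigoplus_k X_{A_k} \otimes \Big(\sum_i K_{i,k} \tau_{B_k} K_{i,k}^\dagger\Big) = \bigoplus_k X_{A_k} \otimes \mathcal{E}_{B_k}(\tau_{B_k}),$$

where for each $k$,
$\mathcal{E}_{B_k} : \mathcal{B}(\mathcal{H}_{B_k}) \to \mathcal{B}(\mathcal{H}_{B_k})$ is
a CPTP map with Kraus operators $\{K_{i,k}\}$. Schauder's fixed point theorem [27] states
that every CPTP map has at least one fixed point. If we let $\tau_{B_k}$ be a fixed point
of $\mathcal{E}_{B_k}$, then $\mathcal{E}(X) = X$. ■ 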

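Now we need to show that $\tilde{\mathcal{A}}$ contains all the fixed points of
$\mathcal{E}$. 

Lemma 5.5. Following the notation in Theorem 5, let $\tilde{\mathcal{A}}$ be defined as in
Lemma 5.4. Then every fixed point of $\mathcal{E}$ is in $\tilde{\mathcal{A}}$. 

Proof. $\tilde{\mathcal{A}}$ is closed under linear combination, so it is a vector subspace
of $\mathcal{B}(\mathcal{P}_0)$. Its dimension is easily calculated: 

$$\dim(\tilde{\mathcal{A}}) = \dim(\mathcal{A}) = \sum_k \dim(A_k)^2 .$$

Let us view $\mathcal{E}_{\mathcal{P}_0}$ and $\mathcal{E}^\dagger_{\mathcal{P}_0}$ as
matrices ($L$ and $L^\dagger$ respectively) that act on vectors in
$\mathcal{B}(\mathcal{P}_0)$. Since each element of $\tilde{\mathcal{A}}$ is fixed by
$\mathcal{E}$, and is therefore an eigenvector of $\mathcal{E}_{\mathcal{P}_0}$ with
eigenvalue $+1$, $\mathcal{E}_{\mathcal{P}_0}$ has a $+1$ eigenspace of dimension at least
$\dim(\mathcal{A})$. Furthermore, if $\mathcal{E}$ had another fixed point outside of
$\tilde{\mathcal{A}}$, then $\mathcal{E}_{\mathcal{P}_0}$'s $+1$ eigenspace would be
strictly larger than that. 

Let $\{O_i\}$ be an orthonormal basis (in the Hilbert-Schmidt inner product) for
$\mathcal{B}(\mathcal{P}_0)$. $L$ has matrix elements
$L_{ij} = \mathrm{tr}\{O_i^\dagger \mathcal{E}_{\mathcal{P}_0}(O_j)\}$, and $L^\dagger$ is
its Hermitian conjugate. The eigenvalues of a matrix and its Hermitian conjugate are
complex conjugates of each other. Thus, the dimensions of the $+1$-eigenspaces of
$\mathcal{E}_{\mathcal{P}_0}$ and $\mathcal{E}^\dagger_{\mathcal{P}_0}$ are equal, and
$\mathrm{Fix}(\mathcal{E})$ and
$\mathrm{Fix}(\mathcal{E}^\dagger_{\mathcal{P}_0}) = \mathcal{A}$ have the same dimension.
So $\mathcal{E}$ has no fixed points outside of $\tilde{\mathcal{A}}$. ■ 

These two lemmas prove that $\mathrm{Fix}(\mathcal{E}) = \tilde{\mathcal{A}}$ is a
distortion of $\mathcal{A}$. 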

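Finally, let us consider the fixed points of $\mathcal{E}^\dagger$. We begin by showing
that they are in 1:1 correspondence with the fixed points of
$\mathcal{E}^\dagger_{\mathcal{P}_0}$, by showing that
$P_0 \mathrm{Fix}(\mathcal{E}^\dagger) P_0 = \mathcal{A}$. The first step is relatively
straightforward. 

Lemma 5.6. Following the notation in Theorem 5,
$P_0 \mathrm{Fix}(\mathcal{E}^\dagger) P_0 \subseteq \mathcal{A}$. 

Proof. The Kraus operators of $\mathcal{E}^\dagger$ are (by Eq. (B5) in Lemma 5.3) 

$$E_i^\dagger = \begin{pmatrix} \bigoplus_k \mathbb{1}_{A_k} \otimes K_{i,k}^\dagger & 0 \\ D_i^\dagger & C_i^\dagger \end{pmatrix}.$$

Let $X$ be an element of $\mathrm{Fix}(\mathcal{E}^\dagger)$. By writing $X$ in block form
with respect to the decomposition $\mathcal{H} = \mathcal{P}_0 \oplus \bar{\mathcal{P}}_0$,
and noting that $\mathcal{E}^\dagger(X) = \sum_i E_i^\dagger X E_i$, it is straightforward
to show that 

$$\mathcal{E}^\dagger_{\mathcal{P}_0}(P_0 X P_0) = P_0 \, \mathcal{E}^\dagger(X) \, P_0,$$

and since $\mathcal{E}^\dagger(X) = X$, we conclude that $P_0 X P_0$ is a fixed point of
$\mathcal{E}^\dagger_{\mathcal{P}_0}$, and therefore is an element of $\mathcal{A}$. So
$P_0 \mathrm{Fix}(\mathcal{E}^\dagger) P_0 \subseteq \mathcal{A}$. ■ 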

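Now we need to show that
$\mathcal{A} \subseteq P_0 \mathrm{Fix}(\mathcal{E}^\dagger) P_0$. This is a bit more
difficult, and requires a technical lemma. Let us partition the Hilbert-Schmidt space into
subspaces as follows: 

$$\mathcal{K} = \mathcal{B}(\mathcal{H}), \qquad \mathcal{K}_0 = \mathcal{B}(\mathcal{P}_0), \qquad \bar{\mathcal{K}}_0 = \mathcal{K} / \mathcal{K}_0 .$$

We can write the matrix representing $\mathcal{E}$ in block form as 

$$L_{\mathcal{E}} = \begin{pmatrix} L_{\mathcal{E}_{\mathcal{P}_0}} & L_{\mathcal{Q}} \\ 0 & L_{\mathcal{F}} \end{pmatrix}.$$

Here, $L_{\mathcal{E}}$ corresponds to the map $\mathcal{E}$, which acts on vectors in
$\mathcal{K}$. $L_{\mathcal{E}_{\mathcal{P}_0}}$ corresponds to the map
$\mathcal{E}_{\mathcal{P}_0}$ and maps $\mathcal{K}_0$ back into itself. $L_{\mathcal{F}}$
maps $\bar{\mathcal{K}}_0$ back into itself, while $L_{\mathcal{Q}}$ maps
$\bar{\mathcal{K}}_0$ to $\mathcal{K}_0$. Because $\mathcal{P}_0$ is an invariant subspace,
$L_{\mathcal{E}}$ does not map $\mathcal{K}_0$ to $\bar{\mathcal{K}}_0$. The matrix for
$\mathcal{E}^\dagger$ is the Hermitian conjugate $L_{\mathcal{E}}^\dagger$. 

Lemma 5.7. $L_{\mathcal{F}}$ has no fixed points. 

Proof. Suppose there exists $X \in \bar{\mathcal{K}}_0$ such that $L_{\mathcal{F}}(X) = X$.
Define $Y = L_{\mathcal{Q}}(X)$. Then 

$$L_{\mathcal{E}} \begin{pmatrix} 0 \\ X \end{pmatrix} = \begin{pmatrix} Y \\ X \end{pmatrix},$$

and the action of $\mathcal{E}^n$ on the operator corresponding to
$\begin{pmatrix} 0 \\ X \end{pmatrix}$ is given by 

$$(L_{\mathcal{E}})^n \begin{pmatrix} 0 \\ X \end{pmatrix} = \begin{pmatrix} \sum_{m=0}^{n-1} L_{\mathcal{E}_{\mathcal{P}_0}}^m Y \\ X \end{pmatrix}.$$

If $Y$ is orthogonal to the subspace $\mathrm{Fix}(\mathcal{E})$, then as $n \to \infty$,
the sum converges to 

$$\lim_{n \to \infty} (L_{\mathcal{E}})^n \begin{pmatrix} 0 \\ X \end{pmatrix} = \begin{pmatrix} (\mathbb{1} - L_{\mathcal{E}_{\mathcal{P}_0}})^{-1} Y \\ X \end{pmatrix}.$$

This is a fixed point of $\mathcal{E}$ not contained in $\mathrm{Fix}(\mathcal{E})$, which
contradicts the definition of $\mathrm{Fix}(\mathcal{E})$. On the other hand, if $Y$ is not
orthogonal to $\mathrm{Fix}(\mathcal{E})$, then the sum diverges as $n \to \infty$. This
implies that $\mathcal{E}$ is non-contractive, which violates complete positivity [34]. So,
either way, we have a contradiction. ■ 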



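Using Lemma 5.7, we can show that every fixed point of
$\mathcal{E}^\dagger_{\mathcal{P}_0}$ has an extension to a fixed point of
$\mathcal{E}^\dagger$. 

Lemma 5.8. Let $X_0 \in \mathcal{A}$ be a fixed point of
$\mathcal{E}^\dagger_{\mathcal{P}_0}$. Then there exists a fixed point
$X \in \mathcal{B}(\mathcal{H})$ of $\mathcal{E}^\dagger$ such that $P_0 X P_0 = X_0$. 

Proof. Both $X_0$ and $X$ are vectors in the Hilbert-Schmidt space
$\mathcal{K} = \mathcal{B}(\mathcal{H})$. Using the decomposition
$\mathcal{K} = \mathcal{K}_0 \oplus \bar{\mathcal{K}}_0$, we can write $X_0$ in block form: 

$$X_0 = \begin{pmatrix} X_0 \\ 0 \end{pmatrix}.$$

In this block form, we choose 

$$X = \begin{pmatrix} X_0 \\ (\mathbb{1}_{\bar{\mathcal{K}}_0} - L_{\mathcal{F}}^\dagger)^{-1} L_{\mathcal{Q}}^\dagger X_0 \end{pmatrix}.$$

Note that $L_{\mathcal{F}}$ has no fixed points (by Lemma 5.7), so
$\mathbb{1}_{\bar{\mathcal{K}}_0} - L_{\mathcal{F}}^\dagger$ is invertible, which means
that $X$ is well-defined. Furthermore, $P_0 X P_0 = X_0$ by construction. To show that $X$
is a fixed point of $\mathcal{E}^\dagger$, we simply compute 

$$L_{\mathcal{E}}^\dagger(X) = \begin{pmatrix} L_{\mathcal{E}_{\mathcal{P}_0}}^\dagger X_0 \\ L_{\mathcal{Q}}^\dagger X_0 + L_{\mathcal{F}}^\dagger (\mathbb{1}_{\bar{\mathcal{K}}_0} - L_{\mathcal{F}}^\dagger)^{-1} L_{\mathcal{Q}}^\dagger X_0 \end{pmatrix} = \begin{pmatrix} X_0 \\ (\mathbb{1}_{\bar{\mathcal{K}}_0} - L_{\mathcal{F}}^\dagger)^{-1} L_{\mathcal{Q}}^\dagger X_0 \end{pmatrix} = X. \ ■$$

Lemma 5.8 implies that
$\mathcal{A} \subseteq P_0 \mathrm{Fix}(\mathcal{E}^\dagger) P_0$. Combining this with
Lemma 5.6, we conclude that $\mathcal{A} = P_0 \mathrm{Fix}(\mathcal{E}^\dagger) P_0$,
which completes the proof of Theorem 5. ■ 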

Now, we want to show that $\mathcal{E}$'s noiseless codes have a rigid structure dictated
by the fixed points. 

Lemma 6. Let $\mathcal{E} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ be a CP
map with a full-rank fixed point, whose fixed points induce (see Theorem 5) the
decomposition 

$$\mathcal{H} = \bigoplus_k (A_k \otimes B_k).$$

Then $\mathcal{C}$ is a [convex] maximum noiseless code for $\mathcal{E}$ if and only if
$\mathcal{C}$ comprises all states of the following form 

$$\rho = \bigoplus_k \rho_{A_k} \otimes \mu_{B_k},$$

where the $\rho_{A_k}$ are arbitrary states on $A_k$ and each $\mu_{B_k}$ is a fixed (i.e.,
the same for all $\rho$) state on $B_k$. 

Proof. If $\mathcal{C}$ has the given structure, then: 

1. It is maximum, since it is isometric to the full set of fixed states of $\mathcal{E}$. 

2. It is noiseless, because $\mathcal{E}$ leaves the states on subsystem $A_k$ intact, and
every $\rho_{A_k}$ is paired with the same noise-full state $\mu_{B_k}$. So $\mathcal{E}$
preserves all the weighted 1-norm distances between code states. 

To show the converse, we must show that if $\mathcal{C}$ is not of this form, then it is
not maximum noiseless. If $\mathcal{C}$ is not of this form, then either 

1. It contains only a strict subset of the states given above; or, 

2. It contains at least one state with correlations (off-diagonal elements) between
different $k$-sectors; or 

3. It contains at least one state with correlations between $A_k$ and $B_k$; or 

4. It contains states that differ on $B_k$. 

If $\mathcal{C}$ is a strict subset, then it is obviously not maximum. 

The key to proving the converse is showing that the condition for noiselessness
(Definition 8) forbids correlations between the $k$-sectors as well as between $A_k$ and
$B_k$. The proof relies both on convexity and on the code being maximum. First, recall the
map $\mathcal{E}_\infty$ from Lemma 2, which projects onto the fixed point set
$\mathrm{Fix}(\mathcal{E})$. Given the structure of $\mathrm{Fix}(\mathcal{E})$, the CPTP
map $\mathcal{E}_\infty$ must act on states on $\mathcal{P}_0$ as: 

$$\mathcal{E}_\infty(\rho) = \bigoplus_k \left(\mathrm{tr}_{B_k}\{P_k \rho P_k\} \otimes \tau_{B_k}\right), \tag{B8}$$

where $\tau_{B_k}$ is the fixed state on $B_k$ from Theorem 5 and $P_k$ projects onto the
$k$th sector. From Lemma 2 we know that for every fixed state of the form
$\rho_f = \bigoplus_k (\sigma_{A_k} \otimes \tau_{B_k})$, there exists exactly one code
state $\rho \in \mathcal{C}$ such that $\mathcal{E}_\infty(\rho) = \rho_f$. From Eq. (B8),
this demands $\mathrm{tr}_{B_k}\{P_k \rho P_k\} = \sigma_{A_k}$ for all $k$. 

Now, focus on the case with only two $k$-sectors, labeled 1 and 2. Consider two fixed
states in these sectors with block-diagonal form: 

$$\rho_{f1} = \begin{pmatrix} \rho'_{f1} & 0 \\ 0 & 0 \end{pmatrix}, \qquad \rho_{f2} = \begin{pmatrix} 0 & 0 \\ 0 & \rho'_{f2} \end{pmatrix}.$$

The two code states that are isometric to the fixed points must respectively be of the form 

$$\rho_1 = \begin{pmatrix} \rho'_1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \rho_2 = \begin{pmatrix} 0 & 0 \\ 0 & \rho'_2 \end{pmatrix}.$$

By convexity of $\mathcal{C}$, any convex combination of $\rho_1$ and $\rho_2$ must also be
in $\mathcal{C}$. This excludes from $\mathcal{C}$ any state with on-diagonals equal to
this convex combination, but non-zero off-diagonals, since the two different states would
have the same image (and hence be indistinguishable) under $\mathcal{E}_\infty$.
Generalizing this to any number of $k$-sectors, we find that any code state in
$\mathcal{C}$ must be block-diagonal: $\rho = \bigoplus_k \rho'_k$. 

Next, consider the state $\rho'_k$ for the $k$th sector. We need to show that only product
states of $A_k \otimes B_k$ are allowed. We first consider a fixed state $\rho'_f$ on this
sector of the form $|\psi\rangle\langle\psi|_{A_k} \otimes \tau_{B_k}$. Since the state on
$A_k$ is pure, the corresponding code state whose image under $\mathcal{E}_\infty$ is
$\rho'_f$ must also be pure on $A_k$. It is hence a product state of the form
$|\psi\rangle\langle\psi|_{A_k} \otimes \mu_{B_k}$. Next, suppose
$\rho'_f = \sigma_{A_k} \otimes \tau_{B_k}$, where $\sigma_{A_k}$ is in general a mixed
state writable as $\sigma_{A_k} = \sum_a q_a |\psi_a\rangle\langle\psi_a|_{A_k}$. Now, each
state $|\psi_a\rangle\langle\psi_a|_{A_k} \otimes \tau_{B_k}$ is a fixed state, with
corresponding code state
$\rho'_{k,a} = |\psi_a\rangle\langle\psi_a|_{A_k} \otimes \mu_{B_k,a}$. By convexity, the
state $\sum_a q_a \rho'_{k,a}$ is also in $\mathcal{C}$ and maps to
$\rho'_f = \sigma_{A_k} \otimes \tau_{B_k}$ under $\mathcal{E}_\infty$. This excludes from
$\mathcal{C}$ any other state with non-zero correlations between $A_k$ and $B_k$, but with
the reduced state on $A_k$ equal to $\sigma_{A_k}$. Furthermore, we must have that
$\mu_{B_k,a} = \mu_{B_k}$ in order for the (1-norm) distinguishability between the
$\rho'_{k,a}$'s to remain unchanged under $\mathcal{E}_\infty$. Therefore, $\rho'_k$ must
be of the form $\sigma_{A_k} \otimes \mu_{B_k}$ for some $\mu_{B_k}$. ■ 

We knew already that noiseless codes are isometric to fixed states (Lemma 2) and that fixed
states are isometric to algebras (Theorem 5). Now we know explicitly what these codes look
like. The isometry is very similar to the one between the fixed states
($\mathrm{Fix}(\mathcal{E})$) and the underlying algebra $\mathcal{A}$: A noiseless code is
obtained from $\mathrm{Fix}(\mathcal{E})$ just by changing the state of the noise-full
factors (see footnote 12). 

12. Since $\mathcal{C}$ only contains states, we are really restricting to the positive
trace-1 operators in $\mathcal{M}_{A_k}$ within $\mathrm{Fix}(\mathcal{E})$ and
$\mathcal{A}$. This is what we mean by "$\mathcal{C}$ is isometric to a matrix algebra." 

Finally, it follows from this lemma that not only can we make preserved codes noiseless,
but we can also make them fixed. 

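Corollary 7. For every maximum preserved code $\mathcal{C}$, there exists a CPTP map
$\mathcal{R}$ such that $\mathcal{R} \circ \mathcal{E}(\rho) = \rho$ for all states
$\rho \in \mathcal{C}$. 

Proof. From Theorem 1, we know that every preserved code $\mathcal{C}$ is correctable, so
there exists a recovery map $\mathcal{R}_0$ such that $\mathcal{C}$ is noiseless for
$\mathcal{R}_0 \circ \mathcal{E}$, and $\mathcal{R}_0 \circ \mathcal{E}$ is unital. By
Lemma 6, $\mathcal{C}$ contains states all of the form
$\rho = \sum_k (\rho_k \otimes \mu_k)$. Now let
$\mathcal{R} = \mathcal{T} \circ \mathcal{R}_0$, where $\mathcal{T}$ does nothing to the
$A_k$ subsystems, but replaces the state of each $B_k$ subsystem with $\mu_k$.
(Constructing such a map is simple, and it is manifestly CPTP.) Now, every
$\rho \in \mathcal{C}$ is a fixed state of $\mathcal{R} \circ \mathcal{E}$. ■ 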



3. Finding preserved IPS is hard 

Lemma 8. The problem of finding the largest preserved IPS for an arbitrary channel
$\mathcal{E} : \mathcal{B}(\mathcal{H}_d) \to \mathcal{B}(\mathcal{H}_{d^2})$ that maps a
$d$-dimensional system to a $d^2$-dimensional system is at least as hard as the NP-complete
problem MAX-CLIQUE. 

Proof. The proof is straightforward, and proceeds in 
three steps. First, we review a known result connect- 
ing classical channels with graphs. Second, we show that 
finding the largest code for a certain set of classical chan- 
nels is equivalent to MAX-CLIQUE. Third, we observe 
that the classical channels can be embedded in quantum 
channels. 

1. A classical channel $\mathcal{E}_c$ maps a set of input symbols $\{1 \ldots N\}$ into
mixtures of a set of output symbols $\{1 \ldots M\}$. For each input symbol $n$, its image
$\mathcal{I}(n)$ is the set of output symbols to which $\mathcal{E}_c$ maps it with nonzero
probability. A set of input symbols $\mathcal{C} = \{n_1 \ldots n_k\}$ is a preserved
zero-error code for $\mathcal{E}_c$ if and only if the images of all the $n_j$ are
disjoint - i.e., it is possible to unambiguously identify which of the input symbols was
sent. We can define the channel's adjacency graph $G$ (see Example 6) as follows: The
vertices are labeled by input symbols $\{1 \ldots N\}$, and two vertices $\{n, m\}$ are
connected by an edge if and only if the images $\mathcal{I}(n)$ and $\mathcal{I}(m)$ are
overlapping. Now, a code $\mathcal{C}$ is a subgraph of $G$, and it is preserved if and
only if no two of its vertices are connected - i.e., if it is an independent set of $G$.
The largest code is a maximum independent set of $G$. An independent set for $G$ is a
clique for its dual (complement) graph $G'$, and finding the maximum clique for an
arbitrary $G'$ is a well-known NP-complete problem called MAX-CLIQUE. 

2. We haven't yet shown that finding a classical channel's
largest code is NP-complete: perhaps all
channels' adjacency graphs are easy instances of
MAX-CLIQUE? This turns out not to be the
case; any graph $H$ can be the adjacency graph of
a classical channel. Let $H$ be a graph with vertices
$\{1 \ldots d\}$, and let $\mathcal{E}$ be a classical channel from
$\{1 \ldots d\} \to \{1 \ldots d^2\}$, defined as follows:

(a) The $d$ input symbols are denoted $v \in \{1 \ldots d\}$,
and the $d^2$ output symbols are denoted by ordered
pairs $u \in \{1 \ldots d\} \times \{1 \ldots d\}$.

(b) For each input symbol $v \in \{1 \ldots d\}$, $\mathcal{E}$ maps
$v$ (with nonzero probability) to each of the $d$
output symbols $\{(v, x) : x = 1 \ldots d\}$.

(c) For each pair of distinct input symbols $v$ and $v'$, $\mathcal{E}$ also
maps $v$ to the output symbol $(v', v)$ if and only
if $H$ contains the edge $(v', v)$.

Note that each output symbol $(a, b)$ can be produced
by at most two input symbols ($a$ and $b$). So,
if two input symbols $v$ and $v'$ are connected in $H$,
then $\mathcal{E}$ maps both of them to the output symbol
$(v', v)$, and so they are connected in the adjacency
graph $G$. But, if they are not connected in $H$, then
they are not mapped to the same output symbol,
so they are not connected in $G$. Ergo, $G = H$, and
any graph can be produced as the adjacency graph
of a channel.

3. Finally, we need to show that for each such graph,
we can construct a quantum channel. This is rather
easy. Let the input space be $\mathcal{H}_d$ and the output
space be $\mathcal{H}_{d^2}$. Let $\{|1\rangle, \ldots, |d\rangle\}$ be a basis for $\mathcal{H}_d$.
Then the quantum channel $\mathcal{E}$ we will consider acts as follows: First,
it dephases in the given basis (i.e., measures it);
and then it acts as the classical channel above.
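
The graph construction in steps 1 and 2 can be checked on a small instance. The following is a minimal sketch, not the authors' algorithm: the function names (`channel_images`, `adjacency_graph`, `largest_zero_error_code`) and the example graph are hypothetical, and the brute-force search is exponential in the number of vertices, as one would expect for an NP-hard problem. It builds the image sets from rules (b) and (c), confirms that the adjacency graph reproduces $H$, and finds a largest zero-error code.

```python
from itertools import combinations

def channel_images(d, edges):
    """Image sets of the classical channel built from a graph H on vertices 1..d.

    Output symbols are ordered pairs (a, b). Input v reaches (v, x) for every x
    (rule b), and also reaches (w, v) whenever {w, v} is an edge of H (rule c).
    """
    edge_set = {frozenset(e) for e in edges}
    images = {}
    for v in range(1, d + 1):
        img = {(v, x) for x in range(1, d + 1)}
        img |= {(w, v) for w in range(1, d + 1)
                if w != v and frozenset((w, v)) in edge_set}
        images[v] = img
    return images

def adjacency_graph(images):
    """Edges join pairs of inputs whose images overlap (confusable inputs)."""
    return {frozenset((n, m)) for n, m in combinations(images, 2)
            if images[n] & images[m]}

def largest_zero_error_code(images):
    """Brute-force maximum independent set of the adjacency graph."""
    g = adjacency_graph(images)
    verts = list(images)
    for size in range(len(verts), 0, -1):
        for cand in combinations(verts, size):
            if all(frozenset(p) not in g for p in combinations(cand, 2)):
                return set(cand)
    return set()

# Hypothetical example graph H: the path 1 - 2 - 3 - 4.
H_edges = [(1, 2), (2, 3), (3, 4)]
images = channel_images(4, H_edges)
G = adjacency_graph(images)
code = largest_zero_error_code(images)
```

For the path graph used here, the adjacency graph has the same three edges as $H$, and the largest code contains two mutually non-confusable input symbols.
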



4. Unitarily noiseless codes 

The analysis of unitarily noiseless codes follows closely
that of the noiseless codes. The rotating points of $\mathcal{E}$
replace its fixed points, with a CPTP map that projects
onto their span playing the role that $\mathcal{E}_\infty$ does for noiseless
codes.

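Lemma 9. If $\mathcal{C}$ is a maximum unitarily noiseless code
for a CP map $\mathcal{E}$, then $\mathcal{C}$ is isometric to the set of all
(positive trace-1) states in the span of the rotating points
of $\mathcal{E}$. In other words, there exists a map $\mathcal{E}_{\rm inf}$ such that
$\| p\,\mathcal{E}_{\rm inf}(\rho) - (1-p)\,\mathcal{E}_{\rm inf}(\sigma) \|_1 = \| p\rho - (1-p)\sigma \|_1$ for any
$\rho, \sigma \in \mathcal{C}$, $p \in [0,1]$, and $\mathcal{E}_{\rm inf}(\rho)$ and $\mathcal{E}_{\rm inf}(\sigma)$ are in the
span of the rotating points of $\mathcal{E}$.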

Proof. By Definition 17, a rotating point $X$ of $\mathcal{E}$ is a linear
combination of operators $X_k$ such that $\mathcal{E}(X_k) = e^{i\phi_k} X_k$.
Let Rot($\mathcal{E}$) be the complex span of all rotating points of
$\mathcal{E}$. It is convenient to move to the Hilbert-Schmidt space,
where Rot($\mathcal{E}$) can be viewed as a subspace spanned by
the vectors corresponding to the rotating points. Clearly,
Rot($\mathcal{E}$) is an invariant subspace under the linear map $\mathcal{E}$,
in the sense that any vector in Rot($\mathcal{E}$) gets mapped under
$\mathcal{E}$ to another vector in Rot($\mathcal{E}$). Let $\mathcal{E}_R$ denote $\mathcal{E}$ restricted
to Rot($\mathcal{E}$). We view $\mathcal{E}$ and $\mathcal{E}_R$ as matrices acting on
vectors in the Hilbert-Schmidt space.

Even though $\mathcal{E}$ may not be a diagonalizable matrix, we
can still write it in the Jordan normal form [55]: There
exists an invertible matrix $S$ such that $\mathcal{E} = S J S^{-1}$, where
$J$ is the matrix $J = \mathrm{diag}[J_1, J_2, \ldots, J_K]$. Each $J_k$ is
called a Jordan block, and it is zero except on the diagonal
and first-off-diagonal:

\[
J_k = \begin{pmatrix}
\lambda_k & 1 & & \\
 & \lambda_k & \ddots & \\
 & & \ddots & 1 \\
 & & & \lambda_k
\end{pmatrix}. \tag{B9}
\]

The Jordan form for $\mathcal{E}$ is unique up to permutation of the
Jordan blocks. Note that any vector $|v\rangle$ is an eigenvector
of $J$ if and only if $S|v\rangle$ is an eigenvector of $\mathcal{E}$.

Lemma 9.1. For any $k$, the support of $J_k$ contains exactly
one unit eigenvector of $\mathcal{E}$. The corresponding eigenvalue
is $\lambda_k$.

Proof. Let $\{|v_a^{(k)}\rangle\}_{a=1}^{m}$ be the ordered basis for the support
of $J_k$ in which $J_k$ takes the form Eq. (B9). Clearly,
$J_k |v_1^{(k)}\rangle = \lambda_k |v_1^{(k)}\rangle$, so $S|v_1^{(k)}\rangle$ is an eigenvector of $\mathcal{E}$ with
eigenvalue $\lambda_k$. To show that this is the only eigenvector
in this Jordan block, let $|v\rangle = \sum_a \mu_a |v_a^{(k)}\rangle$ be a
vector in the support of $J_k$. From the form of $J_k$ in
Eq. (B9), it is easy to see that the coefficients $\{\mu_a\}$
satisfy the equation $J_k |v\rangle = \alpha |v\rangle$ for some constant $\alpha$
only if $\mu_{a+1} = (\alpha - \lambda_k)\mu_a$ for $a = 1, \ldots, m-1$, and
$(\alpha - \lambda_k)\mu_m = 0$. The only non-trivial solution is $\alpha = \lambda_k$
and $\mu_1 \neq 0$, $\mu_{a>1} = 0$. ■
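
The statement of Lemma 9.1 can be checked numerically for a single Jordan block. The following sketch is illustrative only; the helper names `jordan_block` and `eigenspace_dimension`, the block size, and the eigenvalue are assumptions chosen for the example. It confirms that the eigenspace of the block is one-dimensional and is spanned by the first basis vector.

```python
import numpy as np

def jordan_block(lam, m):
    """m x m Jordan block: lam on the diagonal, 1 on the first superdiagonal."""
    return lam * np.eye(m, dtype=complex) + np.diag(np.ones(m - 1), k=1)

def eigenspace_dimension(J, lam, tol=1e-10):
    """Dimension of ker(J - lam*I), counted from the vanishing singular values."""
    s = np.linalg.svd(J - lam * np.eye(J.shape[0]), compute_uv=False)
    return int(np.sum(s < tol))

lam = np.exp(0.7j)                 # arbitrary illustrative eigenvalue
J = jordan_block(lam, 4)

e1 = np.zeros(4, dtype=complex)    # first basis vector of the block
e1[0] = 1.0

geo_mult = eigenspace_dimension(J, lam)       # geometric multiplicity of lam
is_eigvec = np.allclose(J @ e1, lam * e1)     # e1 is an eigenvector
```
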
This lemma tells us that the rotating points of $\mathcal{E}$
are mutually orthogonal, unless there are degenerate
eigenspaces of rotating points. In that case, we can still
pick an orthonormal basis for each degenerate eigenspace
(already done in the Jordan normal form), and these
bases, together with the non-degenerate rotating points,
form an orthonormal basis of rotating points for Rot($\mathcal{E}$).
We denote this basis as $\{X_l\}$. $\mathcal{E}_R$ is diagonal in this basis,
with entries $e^{i\phi_l}$ ($= \lambda_l$). Note that, for any CPTP map
$\mathcal{E}$, the following lemma from [56] holds:

Lemma 9.2. Any eigenvalue $\lambda$ of $\mathcal{E}$ must satisfy $|\lambda| \le 1$.

This, together with Lemma 9.1, implies that $|\lambda_k| \le 1\ \forall k$.
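
As a numerical illustration of Lemma 9.2 (not a proof), one can sample a channel and inspect the spectrum of its matrix representation. The sketch below makes several assumptions that are not in the text: the channel is generated from a random isometry, the dimensions and seed are arbitrary, and operators are vectorized row by row, so that $\rho \mapsto K\rho K^\dagger$ is represented by $K \otimes \bar{K}$.

```python
import numpy as np

def random_kraus(d, r, seed=0):
    """r Kraus operators on a d-dimensional system, cut from a random isometry."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(r * d, d)) + 1j * rng.normal(size=(r * d, d))
    v, _ = np.linalg.qr(g)   # orthonormal columns, so sum_i K_i^dagger K_i = identity
    return [v[i * d:(i + 1) * d, :] for i in range(r)]

def superoperator(kraus):
    """Matrix of rho -> sum_i K_i rho K_i^dagger acting on row-major vec(rho)."""
    return sum(np.kron(k, k.conj()) for k in kraus)

kraus = random_kraus(d=3, r=4)
eigvals = np.linalg.eigvals(superoperator(kraus))
spectral_radius = np.max(np.abs(eigvals))   # largest eigenvalue modulus
```

For a trace-preserving map the eigenvalue 1 is always present, so the spectral radius found this way sits at 1 up to rounding error, with all other eigenvalues inside or on the unit circle.
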

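Next, consider powers of $\mathcal{E}$. $\mathcal{E}^n$ can be written using
the Jordan normal form as $S J^n S^{-1}$, where $J^n =
\mathrm{diag}[J_1^n, J_2^n, \ldots, J_K^n]$, with each $J_k^n$ being an upper-triangular
matrix:

\[
J_k^n = \begin{pmatrix}
\lambda_k^n & \binom{n}{1}\lambda_k^{n-1} & \binom{n}{2}\lambda_k^{n-2} & \cdots \\
 & \lambda_k^n & \binom{n}{1}\lambda_k^{n-1} & \ddots \\
 & & \ddots & \ddots \\
 & & & \lambda_k^n
\end{pmatrix}. \tag{B10}
\]

Using the form of $J_k^n$ in Eq. (B10), we can show the
following fact about the rotating points of $\mathcal{E}$: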

Lemma 9.3. Any (non-degenerate) rotating point of $\mathcal{E}$
must occur in a 1-dimensional Jordan block.

Proof. (This proof follows ideas from [56] for the proof
of Lemma 9.2.) Suppose there exists a rotating point
$X$ such that it belongs to some $m \times m$ Jordan block
$J_k$ with $m > 1$. Let $\{X_a\}_{a=1}^{m}$ be an operator basis
for the operators in the support (as vectors) of $J_k$, with
$X_1 \equiv X$. Consider the completely mixed state $\rho_U \equiv \mathbb{1}/d$
($d$ is the dimension of the Hilbert space). Let $\sigma$
be some operator in the span of $\{X_a\}_{a=2}^{m}$ and consider
the operator $\rho = \rho_U + \eta\sigma$, where $\eta$ is a positive number
chosen small enough so that $\rho$ is positive. Applying $\mathcal{E}^n$ to
$\rho$ gives $\mathcal{E}^n(\rho) = \mathcal{E}^n(\rho_U) + \eta\,\mathcal{E}^n(\sigma)$. Since $\mathcal{E}$ is TP, $\mathcal{E}^n(\rho_U)$
remains finite. However, since $X$ is a rotating point,
we know that $|\lambda_k| = 1$, so the entries of $J_k^n$ grow in
amplitude as $n$ increases, and hence the entries of $\mathcal{E}^n(\sigma)$
(viewed as a vector) grow in amplitude. For large enough
$n$ ($\eta$ fixed), there will be a choice of $\sigma$ such that $\mathcal{E}^n(\rho)$
is no longer positive semidefinite. But this violates the
assumption that $\mathcal{E}$ is a CPTP map. Hence, we must have
that $m = 1$. ■
Lemma 9.3 tells us that any Jordan block $J_k$ with $m > 1$
must have $|\lambda_k| < 1$.
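
The growth argument used in this proof can be illustrated numerically. The sketch below is only an illustration under assumed parameters (a $2 \times 2$ block, phase 0.3, and modulus 0.9 are arbitrary choices, and the helper names are hypothetical). It compares the largest entry of $J_k^n$ when $|\lambda_k| = 1$, where the off-diagonal entry grows linearly in $n$, with the case $|\lambda_k| < 1$, where every entry eventually decays.

```python
import numpy as np

def jordan_block(lam, m):
    """m x m Jordan block: lam on the diagonal, 1 on the first superdiagonal."""
    return lam * np.eye(m, dtype=complex) + np.diag(np.ones(m - 1), k=1)

def max_entry_of_power(lam, m, n):
    """Largest entry modulus of the n-th power of the Jordan block."""
    return np.max(np.abs(np.linalg.matrix_power(jordan_block(lam, m), n)))

# |lambda| = 1: the off-diagonal entry of the 2 x 2 block is n * lambda^(n-1).
on_circle = [max_entry_of_power(np.exp(0.3j), 2, n) for n in (1, 10, 100)]

# |lambda| < 1: entries may rise at first but decay to zero for large n.
inside = [max_entry_of_power(0.9, 2, n) for n in (1, 10, 100, 1000)]
```

A block of size larger than one with a unit-modulus eigenvalue therefore produces unbounded powers, which is what a trace-preserving positive map cannot tolerate.
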

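Now, let $\{Y_\beta\}$ be an operator basis for operators outside
of Rot($\mathcal{E}$). The $Y_\beta$'s are the operators occurring in Jordan
blocks with $|\lambda_k| < 1$, and hence $\lim_{n\to\infty} \mathcal{E}^n(Y_\beta) = 0$, since
Eq. (B10) tells us that $\lim_{n\to\infty} J_k^n = 0$ if $|\lambda_k| < 1$. We
can use $\{X_l\} \cup \{Y_\beta\}$ as an operator basis for $\mathcal{B}(\mathcal{H})$, and
write any operator $A \in \mathcal{B}(\mathcal{H})$ as $A = \sum_l \alpha_l X_l + \sum_\beta \gamma_\beta Y_\beta$.
Then,

\[
\lim_{n\to\infty} \mathcal{E}^n(A) = \lim_{n\to\infty} \Big( \sum_l \alpha_l\, \mathcal{E}^n(X_l) + \sum_\beta \gamma_\beta\, \mathcal{E}^n(Y_\beta) \Big)
= \sum_l \alpha_l \lim_{n\to\infty} (\mathcal{E}_R)^n (X_l), \tag{B11}
\]

assuming the limit $\lim_{n\to\infty} (\mathcal{E}_R)^n (X_l)$ exists for all $l$.

To work out what $\lim_{n\to\infty} (\mathcal{E}_R)^n (X_l)$ is, we need the
following lemma: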

Lemma 9.4. For every $\epsilon > 0$, there exists some $N_\epsilon \in \mathbb{N}$
such that $\| (\mathcal{E}_R)^{N_\epsilon} - \mathbb{1}_R \| < \epsilon$, where $\mathbb{1}_R$ is the identity
operator on Rot($\mathcal{E}$).

Proof. Recall that $\mathcal{E}_R$ is a diagonal matrix, with entries
$e^{i\phi_l}$, $l = 1, \ldots, M$, where $M = \dim(\mathrm{Rot}(\mathcal{E}))$. Therefore,
$(\mathcal{E}_R)^n$ is also diagonal, with entries $e^{in\phi_l}$, and in particular
$(\mathcal{E}_R)^0 = \mathbb{1}$. The set of all such matrices forms an
$M$-torus with a finite volume $(2\pi)^M$. Each $(\mathcal{E}_R)^n$ is surrounded
by an $\epsilon$-neighborhood $\mathcal{N}_n$, containing all matrices
$X$ on the torus such that $\| (\mathcal{E}_R)^n - X \| < \epsilon$. Each such
neighborhood has volume at least $\epsilon^M$, and so if we consider
the neighborhoods of $(\mathcal{E}_R)^n$ for $n = 0 \ldots (2\pi/\epsilon)^M$,
then at least one pair must overlap. Denote the pair with
overlapping neighborhoods

If the $\phi_l$ are all rational multiples of $2\pi$, i.e., $\phi_l = 2\pi p_l / q_l$,
$p_l, q_l \in \mathbb{N}$, then choosing $N_\epsilon$ to be the lowest common
multiple of all $q_l$ works.

Otherwise, a more complicated analysis is required.
To have $\| (\mathcal{E}_R)^{N_\epsilon} - \mathbb{1}_R \| = \max_l | \exp(i N_\epsilon \phi_l) - 1 | =
2 \max_l | \sin(N_\epsilon \phi_l / 2) | < \epsilon$, it suffices to demand
$N_\epsilon \phi_l \ (\mathrm{mod}\ 2\pi) < \epsilon$ for all $l$. Consider the point
$(n\phi_1 \ (\mathrm{mod}\ 2\pi), \ldots, n\phi_M \ (\mathrm{mod}\ 2\pi))$, where we always take
the smallest non-negative value of $n\phi_l \ (\mathrm{mod}\ 2\pi)$. As $n$
increases from 0, this point traces out a trajectory on
the surface of an $M$-dimensional torus. If there is at
least one $\phi_l$ that is a rational multiple of $2\pi$, this trajectory
will eventually close upon itself, and the path
length of the trajectory is finite. If there is no such
$\phi_l$, the trajectory will cover the surface of the torus,
which has finite area (since it is finite-dimensional). Consider
hyperspheres of (Euclidean) diameter $\epsilon$ centered at
$(n\phi_1 \ (\mathrm{mod}\ 2\pi), \ldots, n\phi_M \ (\mathrm{mod}\ 2\pi))$ for each $n \in \mathbb{N}$. Because
the trajectory either has finite length or traverses a
space of finite area, some of these hyperspheres will eventually
overlap, that is, there exist finite $r$ and $s > r$ such
that the hyperspheres centered at points with $n = r$ and
$n = s$ overlap. The distance between the centers of the
overlapping hyperspheres is $\sqrt{\sum_l [(s - r)\phi_l \ (\mathrm{mod}\ 2\pi)]^2} < \epsilon$,
which implies that $(s - r)\phi_l \ (\mathrm{mod}\ 2\pi) < \epsilon$ for all $l$.
Therefore, we can choose $N_\epsilon = s - r$. ■
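
The recurrence asserted by Lemma 9.4 can be observed directly by searching for $N_\epsilon$. The following is a minimal brute-force sketch, not the constructive argument of the proof; the function name `recurrence_time`, the search cutoff `n_max`, and the example phases are assumptions made for illustration.

```python
import math
import numpy as np

def recurrence_time(phases, eps, n_max=10**6):
    """Smallest N >= 1 with max_l |exp(i N phi_l) - 1| < eps, found by direct search."""
    phases = np.asarray(phases, dtype=float)
    for n in range(1, n_max + 1):
        if np.max(np.abs(np.exp(1j * n * phases) - 1.0)) < eps:
            return n
    raise ValueError("no recurrence found below n_max")

# Rational case: phases 2*pi/3 and 2*pi/4, so the lowest common multiple of
# the denominators (12) returns the diagonal matrix exactly to the identity.
rational = [2 * math.pi / 3, 2 * math.pi / 4]
n_rational = recurrence_time(rational, eps=1e-9)

# Incommensurate case: an approximate return still occurs for any eps > 0.
irrational = [1.0, math.sqrt(2.0)]
n_irrational = recurrence_time(irrational, eps=0.1)
deviation = np.max(np.abs(np.exp(1j * n_irrational * np.array(irrational)) - 1.0))
```

The search tests the norm condition $\max_l |\exp(iN\phi_l) - 1| < \epsilon$ itself rather than the sufficient condition on $N\phi_l \ (\mathrm{mod}\ 2\pi)$, so it also accepts returns that approach the identity from below $2\pi$.
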
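We can view the limit $\lim_{n\to\infty} (\mathcal{E}_R)^n$ equivalently as
the limit $\lim_{n\to\infty} (\mathcal{E}_R)^{N_\epsilon n}$. Intuitively, provided we
choose $\epsilon$ to decrease fast enough, this should converge to
$\mathbb{1}_R$. More precisely, we can write $(\mathcal{E}_R)^{N_\epsilon} = \mathbb{1}_R + \mathcal{G}_\epsilon$, where
$\mathcal{G}_\epsilon$ is some map (need not be CP) on Rot($\mathcal{E}$) such that
$\| \mathcal{G}_\epsilon \| < \epsilon$. Now consider the map $(\mathcal{E}_R)^{N_\epsilon n} = (\mathbb{1}_R + \mathcal{G}_\epsilon)^n =
\sum_{m=0}^{n} \binom{n}{m} \mathcal{G}_\epsilon^m$, for $n \in \mathbb{N}$, which gives

\[
\| (\mathcal{E}_R)^{N_\epsilon n} - \mathbb{1}_R \| \le \sum_{m=1}^{n} \binom{n}{m} \| \mathcal{G}_\epsilon \|^m < \epsilon\,(2^n - 1). \tag{B12}
\]

Let us choose $\epsilon = 3^{-n}$ (actually, $\epsilon = c_0^{-n}$ for any choice
of $c_0 > 2$ works). Then taking the limit $n \to \infty$ of Eq.
(B12), we conclude that $\lim_{n\to\infty} (\mathcal{E}_R)^{N_\epsilon n} = \mathbb{1}_R$.

From this, we see that Eq. (B11) can be rewritten as

\[
\lim_{n\to\infty} \mathcal{E}^n(A) = \sum_l \alpha_l X_l \in \mathrm{Rot}(\mathcal{E}).
\]

Therefore, $\mathcal{E}_{\rm inf} = \lim_{n\to\infty} \mathcal{E}^{nN_\epsilon}$ (with $\epsilon$ depending on $n$
as above) is the projection onto Rot($\mathcal{E}$). Since a unitarily
noiseless code is preserved under any power of $\mathcal{E}$, it must
be preserved under $\mathcal{E}_{\rm inf}$, which gives the desired isometry
condition. ■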

Note that $\mathcal{E}_{\rm inf}$ is CPTP simply because $\mathcal{E}$ is CPTP, and
the set of CPTP maps on a finite-dimensional Hilbert
space is closed under composition (and is a closed set, so
the limit of such compositions is again CPTP). Furthermore, it
projects every operator onto the span of the rotating
points of $\mathcal{E}$. Observe that Rot($\mathcal{E}$) is precisely the set
of fixed points of $\mathcal{E}_{\rm inf}$.
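
To make the role of $\mathcal{E}_{\rm inf}$ concrete, the sketch below works through a small hypothetical example that is not taken from the paper: a qutrit channel that applies a phase rotation and then, with probability $q$, destroys coherence between the subspace spanned by $|0\rangle, |1\rangle$ and the state $|2\rangle$. The parameters $\theta = 2\pi/5$ and $q = 1/2$ are arbitrary choices, and operators are vectorized row by row. Because every rotating phase of this channel returns to 1 after 5 applications, a high power $\mathcal{E}^{5k}$ should approach the projector onto Rot($\mathcal{E}$).

```python
import numpy as np

theta, q = 2 * np.pi / 5, 0.5             # illustrative parameters (assumptions)
U = np.diag([1.0, np.exp(1j * theta), 1.0])
P = np.diag([1.0, 1.0, 0.0])              # projector onto span{|0>, |1>}
Q = np.diag([0.0, 0.0, 1.0])              # projector onto |2>
kraus = [np.sqrt(1 - q) * U, np.sqrt(q) * P @ U, np.sqrt(q) * Q @ U]

# Superoperator on row-major vec(rho): vec(K rho K^dagger) = (K kron conj(K)) vec(rho).
E = sum(np.kron(k, k.conj()) for k in kraus)

eigvals, vecs = np.linalg.eig(E)
rotating = np.isclose(np.abs(eigvals), 1.0)   # eigenvalues of unit modulus

# Spectral projector onto the span of the rotating points
# (this particular E is diagonalizable, so eigenvectors suffice).
proj = vecs[:, rotating] @ np.linalg.inv(vecs)[rotating, :]

# N = 5 resets every rotating phase, so E^(5k) approaches the projector.
E_inf_approx = np.linalg.matrix_power(E, 5 * 20)
```

In this example Rot($\mathcal{E}$) is five-dimensional (the four matrix units on the two-dimensional block plus $|2\rangle\langle 2|$), while the four coherences involving $|2\rangle$ are damped by a factor $1 - q$ per application and vanish in the limit.
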